{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/mrdbourke/pytorch-deep-learning/blob/main/extras/pytorch_2_intro.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "\n",
    "[View Source Code](https://github.com/mrdbourke/pytorch-deep-learning/blob/main/extras/pytorch_2_intro.ipynb)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# A Quick PyTorch 2.0 Tutorial"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Notebook last updated: 2023-04-14 15:24:37.007274\n"
     ]
    }
   ],
   "source": [
    "import datetime\n",
    "print(f\"Notebook last updated: {datetime.datetime.now()}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "## 30-second intro\n",
    "\n",
    "PyTorch 2.0 is out!\n",
    "\n",
    "With the main improvement being speed.\n",
    "\n",
    "This comes via a single backwards-compatible line.\n",
    "\n",
    "```python\n",
    "torch.compile()\n",
    "```\n",
    "\n",
    "In other words, after you create your model, you can pass it to `torch.compile()` and in turn expect speedups in training and inference on newer GPUs (e.g. NVIDIA RTX 40 series, A100, H100, the newer the GPU the more noticeable the speedups).\n",
    "\n",
    "> **Note:** There are plenty more upgrades within PyTorch 2.0 than just `torch.compile()` but since it's the main one, it's what we're going to focus on. For a full list of changes, see the [PyTorch 2.0 release notes](https://pytorch.org/blog/pytorch-2.0-release/).\n",
    "\n",
    "### Will my old PyTorch code still work?\n",
    "\n",
    "Yes, PyTorch 2.0 is backwards-compatible. The changes are mostly additive (new features).\n",
    "\n",
    "That means if you already know PyTorch, such as via the [learnpytorch.io](https://learnpytorch.io) course, you can start using PyTorch 2.0 straight away. And your old PyTorch code will still work."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Quick code examples"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Before PyTorch 2.0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "import torchvision\n",
    "\n",
    "model = torchvision.models.resnet50() # note: this could be any model\n",
    "\n",
    "### Train model ###\n",
    "\n",
    "### Test model ###"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### After PyTorch 2.0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "import torchvision\n",
    "\n",
    "model = torchvision.models.resnet50() # note: this could be any model\n",
    "compiled_model = torch.compile(model) # <- magic happens!\n",
    "\n",
    "### Train model ### <- faster!\n",
    "\n",
    "### Test model ### <- faster!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Speedups\n",
    "\n",
    "Ok so the focus of PyTorch 2.0 is speed, how much faster is it actually?\n",
    "\n",
    "The PyTorch team ran tests across 163 open-source models from [Hugging Face Transformers](https://huggingface.co/docs/transformers/index), [timm](https://github.com/huggingface/pytorch-image-models) (PyTorch Image Models) and [TorchBench](https://github.com/pytorch/benchmark) (a curated set of popular code bases from across GitHub).\n",
    "\n",
    "This is important because unless PyTorch 2.0 is faster on models people actually use, it’s not faster.\n",
    "\n",
    "Using a mixture of AMP (automatic mixed precision or float16) training and float32 precision (higher precision requires more compute) the PyTorch team found that `torch.compile()` provides an average speedup of 43% in training on a NVIDIA A100 GPU.\n",
    "\n",
    "Or 38% on timm, 76% on TorchBench and 52% on Hugging Face Transformers.\n",
    "\n",
    "<img src=\"https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/extras-pytorch-2-speedups.png\" alt=\"speedups for PyTorch 2.0 across various model resources\" width=650/>\n",
    "\n",
    "*PyTorch 2.0 speedups across various models from different locations. *Source:* [PyTorch 2.0 announcement post](https://pytorch.org/get-started/pytorch-2.0/).*"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3-minute overview\n",
    "\n",
 **Note:** The following">
    "> **Note:** The following is adapted from [*A Quick Introduction to PyTorch 2.0*](https://www.mrdbourke.com/pytorch-2/) on mrdbourke.com. There's also an accompanying [video explainer on YouTube](https://youtu.be/WqLKfta5Ijw).\n",
    "\n",
    "What's happening behind the scenes of `torch.compile()`?\n",
    "\n",
    "`torch.compile()` is designed to \"just work\" but there are a few technologies behind it:\n",
    "* TorchDynamo\n",
    "* AOTAutograd\n",
    "* PrimTorch\n",
    "* TorchInductor\n",
    "\n",
    "The [PyTorch 2.0 getting started notes](https://pytorch.org/get-started/pytorch-2.0/) explain these in more detail but from a high level the two main improvements `torch.compile()` offers are:\n",
    "* Fusion (or operator fusion)\n",
    "* Graph capture (or graph tracing)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Fusion\n",
    "\n",
    "Fusion, also known as **operator fusion** is one of the best ways to make deep learning models go brrrrrr (brrrrrr is the sound your GPUs fan make when your models are training).\n",
    "\n",
    "Operator fusion condenses (like Dragon Ball Z) many operations into one (or many to less).\n",
    "\n",
    "Why?\n",
    "\n",
    "Modern GPUs have so much compute power they are often not compute limited, as in, the main bottleneck to training models is how fast you can get data from your CPU to your GPU.\n",
    "This is known as bandwidth or memory bandwidth.\n",
    "\n",
    "You want to reduce your bandwidth costs as much as possible.\n",
    "\n",
    "And feed the data hungry GPUs with as much data as possible.\n",
    "\n",
    "<img src=\"https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/extras-memory-bandwidth-output-small.gif\" alt=\"example of memory bandwidth costs transferring data on and off the GPU\" width=950/>\n",
    "\n",
    "So instead of performing an operation on a piece of data and then saving the result to memory (increased bandwidth costs), you chain together as many operations as possible via fusion.\n",
    "\n",
    "A rough analogy would be using a blender to make a smoothie.\n",
    "\n",
    "Most blenders are good at blending things (like GPUs are good at performing matrix multiplications).\n",
    "\n",
    "Using a blender **without operator fusion** would be like adding each ingredient one by one and blending each time a new ingredient is added.\n",
    "Not only is this insane, it increases your bandwidth cost.\n",
    "\n",
    "The actual blending is fast each time (like GPU computations generally are) but you lose a bunch of time adding each ingredient one by one.\n",
    "\n",
    "Using a blender **with operator fusion** is akin to using a blender by adding all the ingredients at the start (operator fusion) and then performing the blend once.\n",
    "\n",
    "You lose a little time adding at the start but you gain all of the lost memory bandwidth time back."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Graph capture\n",
    "\n",
    "Graph capture I’m less confident explaining.\n",
    "\n",
    "But the way I think about it is that graph capture or graph tracing is:\n",
    "\n",
    "* Going through a series of operations that need to happen, such as the operations in a neural network.\n",
    "* And capturing or tracing what needs to happen ahead of time.\n",
    "\n",
    "Computing **without graph capture** is like going to a new area and following GPS directions turn by turn.\n",
    "\n",
    "As a good human driver, you can follow the turns quite easily but you still have to think about each turn you take.\n",
    "\n",
    "This is the equivalent to PyTorch having to look up what each operation does as it does it.\n",
    "\n",
    "As in, to perform an addition, it has to look up what an addition does before it can perform it.\n",
    "\n",
    "It does this quickly but there’s still non-zero overhead.\n",
    "\n",
    "<img src=\"https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/extras-graph-capture.gif\" alt=\"Graph capture\" width=\"950\"/>\n",
    "\n",
    "*Example of graph capture, mapping out the steps in a neural network and then capturing every operation that needs to happen ahead of time.*\n",
    "\n",
    "Computing **with graph capture** is like driving through your own neighbourhood.\n",
    "\n",
    "You barely think about what turns to make.\n",
    "\n",
    "Sometimes you get out of the car and realise you can’t remember the last 5 minutes of the drive.\n",
    "\n",
    "Your brain was functioning on autopilot, minimal overhead.\n",
    "\n",
    "However, it took you some time upfront to remember how to drive to your house.\n",
    "\n",
    "This is a caveat of graph capture, it takes a little time upfront to memorize the operations that need to happen but subsequent computations should be faster.\n",
    "\n",
    "Of course, this is a quick high-level overview of what’s happening behind the scenes of torch.compile()but it's how I understand it.\n",
    "\n",
    "For more on fusion and graph tracing, I’d recommend Horace He’s [*Making Deep Learning Go Brrrr From First Principles*](https://horace.io/brrr_intro.html) blog post."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Things to note\n",
    "\n",
    "Since PyTorch 2.0 was just released, there are a few limitations with some of the features.\n",
    "\n",
    "One of the main ones being with exporting models.\n",
    "\n",
    "<img src=\"https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/extras-pytorch-2-limitations.png\" alt=\"PyTorch 2 limitations\" width=650/>\n",
    "\n",
    "*There are a few caveats when using the PyTorch 2.0 features, such as not being about to export to mobile devices when using the `torch.compile()` default options. However, there are work arounds to this and improved exporting is on the PyTorch 2.x roadmap. *Source:* [PyTorch 2.0 announcement post](https://pytorch.org/get-started/pytorch-2.0/).*\n",
    "\n",
    "However, these will likely be fixed in future releases.\n",
    "\n",
    "Another main limitation is that because the features of PyTorch 2.0 are designed for newer hardware, old GPUs and desktop-class GPUs (e.g. NVIDIA RTX 30 series) will likely see less speedups than newer hardware."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What we're going to cover\n",
    "\n",
    "Since many of the upgrades in PyTorch 2.0 are speed focused and happen behind the scenes (e.g. PyTorch takes care of them for you), in this notebook we're going to run a comparative speed test.\n",
    "\n",
    "Namely we'll make two of the same models, one using the default PyTorch setup and the other using the new `torch.compile()` setup and we'll train them on the same dataset.\n",
    "\n",
    "1. Model 1 - no `torch.compile()`.\n",
    "2. Model 2 - `torch.compile()`.\n",
    "\n",
    "We'll then compare the training/testing times of both models for single run and multiple runs.\n",
    "\n",
    "| **Experiment** | **Model** | **Data** | **Epochs** | **Batch size** | **Image size** | **`torch.compile()`** |  \n",
    "|----- |-----| -----| -----| -----| -----| -----|\n",
    "| 1 (single run) | [ResNet50](https://pytorch.org/vision/master/models/generated/torchvision.models.resnet50.html) | [CIFAR10](https://pytorch.org/vision/stable/generated/torchvision.datasets.CIFAR10.html#torchvision.datasets.CIFAR10) | 5 | 128 | 224 | No |\n",
    "| 2 (single run) | ResNet50 | CIFAR10 | 5 | 128 | 224 | Yes |\n",
    "| 3 (multi-run) | ResNet50 | CIFAR10 | 3x5 | 128 | 224 | No |\n",
    "| 4 (multi-run) | ResNet50 | CIFAR10 | 3x5 | 128 | 224 | Yes |\n",
    "\n",
    "We've chosen ResNet50 and CIFAR10 here for ease of access or use, however, you could substitute any model/dataset you like.\n",
    "\n",
    "The biggest speedups I've noticed with PyTorch 2.0 are when the GPU computes on as much data as possible (e.g. larger batch size/image size/data size/model size). \n",
    "\n",
    "> **Note:** Depending on the size of your GPU, you may have to lower the batch size (or image size) to fit the model on your GPU. For example, if you're using a GPU with 8GB of memory or less, you may have to lower the batch size to 64 or 32."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 0. Getting setup\n",
    "\n",
    "To get setup we'll first check for PyTorch 2.x+ and install it if it's not available. \n",
    "\n",
    "You can see how to install PyTorch 2.x on your own system in the [PyTorch documentation](https://pytorch.org/get-started/locally/).\n",
    "\n",
 **Note:** If you're running">
    "> **Note:** If you're running on Google Colab, you'll need to set up a GPU: runtime -> change runtime type -> hardware accelerator. The best speedups are on newer NVIDIA/AMD GPUs (this is because PyTorch 2.0 leverages newer GPU hardware) such as the NVIDIA A100 and above. This tutorial focuses on NVIDIA GPUs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[INFO] Current PyTorch version: 2.0.0+cu118 (should be 2.x+)\n",
      "[INFO] PyTorch 2.x installed, you'll be able to use the new features.\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "\n",
    "# Check PyTorch version\n",
    "pt_version = torch.__version__\n",
    "print(f\"[INFO] Current PyTorch version: {pt_version} (should be 2.x+)\")\n",
    "\n",
    "# Install PyTorch 2.0 if necessary\n",
    "if pt_version.split(\".\")[0] == \"1\": # Check if PyTorch version begins with 1 \n",
    "    !pip3 install -U torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118\n",
    "    print(\"[INFO] PyTorch 2.x installed, if you're on Google Colab, you may need to restart your runtime.\\\n",
    "          Though as of April 2023, Google Colab comes with PyTorch 2.0 pre-installed.\")\n",
    "    import torch\n",
    "    pt_version = torch.__version__\n",
    "    print(f\"[INFO] Current PyTorch version: {pt_version} (should be 2.x+)\")\n",
    "else:\n",
    "    print(\"[INFO] PyTorch 2.x installed, you'll be able to use the new features.\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wonderful!\n",
    "\n",
    "Now PyTorch 2.x is installed, let's try out the new features!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Get GPU info\n",
    "\n",
    "Time to get GPU info.\n",
    "\n",
    "Why?\n",
    "\n",
    "Many of the speedups PyTorch 2.0 offers are best experienced on newer NVIDIA GPUs (we're focused on NVIDIA GPUs for now).\n",
    "\n",
    "This is because PyTorch 2.0 takes advantage of the new hardware on newer GPUs.\n",
    "\n",
    "How do you tell what's a newer GPU?\n",
    "\n",
    "Generally, a *newer* GPU will have a compute capability score of 8.0 or higher.\n",
    "\n",
    "You can see a list of [NVIDIA GPU compute capability scores](https://developer.nvidia.com/cuda-gpus) on NVIDIA's developer page.\n",
    "\n",
    "Here are some scores of NVIDIA GPUs released in 2020 or later:\n",
    "\n",
    "| **NVIDIA GPU** | **Compute capability score** | **GPU Type** | **Release year** | **Architecture** |\n",
    "|----- |-----| -----| -----| -----| \n",
    "| RTX 4090 | 8.9 | Desktop-class | 2022 | [Ada Lovelace](https://www.nvidia.com/en-au/geforce/ada-lovelace-architecture/) |\n",
    "| RTX 4080 | 8.9 | Desktop-class | 2022 | Ada Lovelace |\n",
    "| RTX 4070 Ti | 8.9 | Desktop-class | 2022 | Ada Lovelace |\n",
    "| RTX 3090 | 8.6 | Desktop-class | 2020 | [Ampere](https://en.wikipedia.org/wiki/Ampere_(microarchitecture)) |\n",
    "| RTX 3080 | 8.6 | Desktop-class | 2020 | Ampere| \n",
    "| RTX 3070 | 8.6 | Desktop-class | 2020 | Ampere |  \n",
    "| RTX 3060 Ti | 8.6 | Desktop-class | 2020 | Ampere | \n",
    "| H100 | 9.0 | Datacenter-class | 2022 | [Hopper](https://developer.nvidia.com/blog/nvidia-hopper-architecture-in-depth/) | \n",
    "| A100 | 8.0 | Datacenter-class | 2020 | Ampere |\n",
    "| A10 | 8.6 | Datacenter-class | 2021 | Ampere |\n",
    "\n",
    "GPUs with a compute capability score of 8.0 or above are likely to see the biggest speedups.\n",
    "\n",
    "And GPUs which are datacenter-class (e.g. A100, A10, H100) are likely to see more significant speedups than desktop-class GPUs (e.g. RTX 3090, RTX 3080, RTX 3070, RTX 3060 Ti).\n",
    "\n",
    "We can check the compute capability score of our GPU using [`torch.cuda.get_device_capability()`](https://pytorch.org/docs/stable/generated/torch.cuda.get_device_capability.html).\n",
    "\n",
    "This will output a tuple of `(major, minor)` compute capability scores, for example, `(8, 0)` for the A100.\n",
    "\n",
    "We'll also get some other details about our GPU such as the name and other info using [`nvidia-smi`](https://developer.nvidia.com/nvidia-system-management-interface).  \n",
    "\n",
    "> **Resource:** For an in-depth comparison of many different NVIDIA GPUs and their speeds, costs and tradeoffs, I'd recommend reading Tim Dettmers' [*Which GPU for deep learning?*](https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/) blog post."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "GPU name: NVIDIA_TITAN_RTX\n",
      "GPU capability score: (7, 5)\n",
      "GPU score lower than (8, 0), PyTorch 2.x speedup features will be limited (PyTorch 2.x speedups happen most on newer GPUs).\n",
      "GPU information:\n",
      "Fri Apr 14 15:24:38 2023       \n",
      "+-----------------------------------------------------------------------------+\n",
      "| NVIDIA-SMI 525.89.02    Driver Version: 525.89.02    CUDA Version: 12.0     |\n",
      "|-------------------------------+----------------------+----------------------+\n",
      "| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |\n",
      "| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |\n",
      "|                               |                      |               MIG M. |\n",
      "|===============================+======================+======================|\n",
      "|   0  NVIDIA TITAN RTX    Off  | 00000000:01:00.0 Off |                  N/A |\n",
      "| 40%   50C    P8     9W / 280W |    260MiB / 24576MiB |      0%      Default |\n",
      "|                               |                      |                  N/A |\n",
      "+-------------------------------+----------------------+----------------------+\n",
      "                                                                               \n",
      "+-----------------------------------------------------------------------------+\n",
      "| Processes:                                                                  |\n",
      "|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |\n",
      "|        ID   ID                                                   Usage      |\n",
      "|=============================================================================|\n",
      "|    0   N/A  N/A      1020      G   /usr/lib/xorg/Xorg                 53MiB |\n",
      "|    0   N/A  N/A   1415245      G   /usr/lib/xorg/Xorg                162MiB |\n",
      "|    0   N/A  N/A   1415374      G   /usr/bin/gnome-shell                8MiB |\n",
      "+-----------------------------------------------------------------------------+\n"
     ]
    }
   ],
   "source": [
    "# Make sure we're using a NVIDIA GPU\n",
    "if torch.cuda.is_available():\n",
    "  gpu_info = !nvidia-smi\n",
    "  gpu_info = '\\n'.join(gpu_info)\n",
    "  if gpu_info.find(\"failed\") >= 0:\n",
    "    print(\"Not connected to a GPU, to leverage the best of PyTorch 2.0, you should connect to a GPU.\")\n",
    "\n",
    "  # Get GPU name\n",
    "  gpu_name = !nvidia-smi --query-gpu=gpu_name --format=csv\n",
    "  gpu_name = gpu_name[1]\n",
    "  GPU_NAME = gpu_name.replace(\" \", \"_\") # remove underscores for easier saving\n",
    "  print(f'GPU name: {GPU_NAME}')\n",
    "\n",
    "  # Get GPU capability score\n",
    "  GPU_SCORE = torch.cuda.get_device_capability()\n",
    "  print(f\"GPU capability score: {GPU_SCORE}\")\n",
    "  if GPU_SCORE >= (8, 0):\n",
    "    print(f\"GPU score higher than or equal to (8, 0), PyTorch 2.x speedup features available.\")\n",
    "  else:\n",
    "    print(f\"GPU score lower than (8, 0), PyTorch 2.x speedup features will be limited (PyTorch 2.x speedups happen most on newer GPUs).\")\n",
    "  \n",
    "  # Print GPU info\n",
    "  print(f\"GPU information:\\n{gpu_info}\")\n",
    "\n",
    "else:\n",
    "  print(\"PyTorch couldn't find a GPU, to leverage the best of PyTorch 2.0, you should connect to a GPU.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.1 Globally set devices\n",
    "\n",
    "One of my favourite new features in PyTorch 2.x is being able to set the [default device type](https://pytorch.org/tutorials/recipes/recipes/changing_default_device.html ) via:\n",
    "* Context manager\n",
    "* Globally\n",
    "\n",
    "Previously, you could only set the default device type via:\n",
    "* `tensor.to(device)`\n",
    "\n",
    "Let's see these two new device settings in action."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Layer weights are on device: cuda:0\n",
      "Layer creating data on device: cuda:0\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "\n",
    "# Set the device\n",
    "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
    "\n",
    "# Set the device with context manager (requires PyTorch 2.x+)\n",
    "with torch.device(device):\n",
    "    # All tensors created in this block will be on device\n",
    "    layer = torch.nn.Linear(20, 30)\n",
    "    print(f\"Layer weights are on device: {layer.weight.device}\")\n",
    "    print(f\"Layer creating data on device: {layer(torch.randn(128, 20)).device}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now how about setting the global device?\n",
    "\n",
    "This will mean that any tensors created without an explicit device will be created on the device you set by default."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Layer weights are on device: cuda:0\n",
      "Layer creating data on device: cuda:0\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "\n",
    "# Set the device\n",
    "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
    "\n",
    "# Set the device globally\n",
    "torch.set_default_device(device)\n",
    "\n",
    "# All tensors created will be on the global device by default\n",
    "layer = torch.nn.Linear(20, 30)\n",
    "print(f\"Layer weights are on device: {layer.weight.device}\")\n",
    "print(f\"Layer creating data on device: {layer(torch.randn(128, 20)).device}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And now back to CPU."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Layer weights are on device: cpu\n",
      "Layer creating data on device: cpu\n"
     ]
    }
   ],
   "source": [
    "import torch \n",
    "\n",
    "# Set the device globally\n",
    "torch.set_default_device(\"cpu\")\n",
    "\n",
    "# All tensors created will be on \"cpu\"\n",
    "layer = torch.nn.Linear(20, 30)\n",
    "print(f\"Layer weights are on device: {layer.weight.device}\")\n",
    "print(f\"Layer creating data on device: {layer(torch.randn(128, 20)).device}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Setting up the experiments \n",
    "\n",
    "Okay, time to measure speed!\n",
    "\n",
    "To keep things simple, as we discussed we're going to run a series of four experiments, all with:\n",
    "\n",
    "* **Model:** ResNet50 (from [TorchVision](https://pytorch.org/vision/main/models/generated/torchvision.models.resnet50.html))\n",
    "* **Data:** CIFAR10 (from [TorchVision](https://pytorch.org/vision/main/generated/torchvision.datasets.CIFAR10.html))\n",
    "* **Epochs:** 5 (single run) and 3x5 (multiple runs)\n",
    "* **Batch size:** 128\n",
    "* **Image size:** 224\n",
    "\n",
    "Each experiment will be run with and without `torch.compile()`.\n",
    "\n",
    "Why the single and multiple runs?\n",
    "\n",
    "Because we can measure speedups via a single run, however, we'll also want to run the tests multiple times to get an average (just to make sure the results from a single run weren't a fluke or something went wrong). \n",
    "\n",
    "> **Note:** Depending on the amount of memory your GPU has, you may have to lower the batch size or the image size. This tutorial is focused on using an NVIDIA A100 GPU with 40GB of memory, the amount of memory on this GPU means it can handle a larger batch size. As of April 2023, NVIDIA A100 GPUs are available via Google Colab Pro. \n",
    "\n",
    "Let's start by importing `torch` and `torchvision` and setting the target device."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "PyTorch version: 2.0.0+cu118\n",
      "TorchVision version: 0.15.1+cu118\n",
      "Using device: cuda\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "import torchvision\n",
    "\n",
    "print(f\"PyTorch version: {torch.__version__}\")\n",
    "print(f\"TorchVision version: {torchvision.__version__}\")\n",
    "\n",
    "# Set the target device\n",
    "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
    "\n",
    "print(f\"Using device: {device}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.1 Create model and transforms\n",
    "\n",
    "Let's now create our model and transforms.\n",
    "\n",
    "We'll use the same setup to create the model and transforms we covered in [06. PyTorch Transfer Learning section 2.2](https://www.learnpytorch.io/06_pytorch_transfer_learning/).\n",
    "\n",
    "In essence, we'll create the model and transforms for the model using the [`torchvision.models`](https://pytorch.org/vision/stable/models.html) API.\n",
    "\n",
    "We can get the weights and transforms for ResNet50 using the following:\n",
    "* `model_weights = torchvision.models.ResNet50_Weights.IMAGENET1K_V2` (this requires `torchvision` 0.14 or later).\n",
    "* `transforms = model_weights.transforms()` (once we have the weights, we can get the appropriate transforms for the model).\n",
    "\n",
 **Note:** We'll count">
    "> **Note:** We'll count the model's parameters to see how big of a model we're working with. The more parameters in a model, the more GPU memory you'll need to train it. However, larger models that use more of the GPU also tend to see larger *relative* speedups. A larger model may take longer to train in total, but because it makes better use of the GPU, its training time often grows more slowly than its size. For example, a model with 10M parameters may take only 5x longer to train than a model with 1M parameters (10x the size but only 5x the training time)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Total parameters of model: 25557032 (the more parameters, the more GPU memory the model will use, the more *relative* of a speedup you'll get)\n",
      "Model transforms:\n",
      "ImageClassification(\n",
      "    crop_size=[224]\n",
      "    resize_size=[232]\n",
      "    mean=[0.485, 0.456, 0.406]\n",
      "    std=[0.229, 0.224, 0.225]\n",
      "    interpolation=InterpolationMode.BILINEAR\n",
      ")\n"
     ]
    }
   ],
   "source": [
    "# Create model weights and transforms\n",
    "model_weights = torchvision.models.ResNet50_Weights.IMAGENET1K_V2 # <- use the latest weights (could also use .DEFAULT)\n",
    "transforms = model_weights.transforms()\n",
    "\n",
    "# Setup model\n",
    "model = torchvision.models.resnet50(weights=model_weights)\n",
    "\n",
    "# Count the number of parameters in the model \n",
    "total_params = sum(\n",
    "    param.numel() for param in model.parameters() # <- all params\n",
    "\t# param.numel() for param in model.parameters() if param.requires_grad # <- only trainable params\n",
    ")\n",
    "\n",
    "print(f\"Total parameters of model: {total_params} (the more parameters, the more GPU memory the model will use, the more *relative* of a speedup you'll get)\")\n",
    "print(f\"Model transforms:\\n{transforms}\")"
   ]
  },
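  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To get a feel for what that parameter count means in memory, here is a back-of-the-envelope sketch. It assumes the weights are stored as float32 (4 bytes per value) and only counts the weights themselves. Training needs considerably more memory for gradients, optimizer state and activations, so treat this as a lower bound. The variable names are hypothetical.\n",
    "\n",
    "```python\n",
    "resnet50_params = 25_557_032  # parameter count printed above\n",
    "bytes_per_param = 4           # assumption: float32 weights, 4 bytes each\n",
    "\n",
    "weight_bytes = resnet50_params * bytes_per_param\n",
    "weight_mib = weight_bytes / 1024**2\n",
    "print(f'Weights alone: {weight_bytes} bytes (~{weight_mib:.1f} MiB)')\n",
    "```"
   ]
  },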
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's turn the above code into a function so we can replicate it later, we'll also adjust the last layer's (`model.fc`) output features to match the number of classes in CIFAR10 (10)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_model(num_classes=10):\n",
    "  \"\"\"\n",
    "  Creates a ResNet50 model with the latest weights and transforms via torchvision.\n",
    "  \"\"\"\n",
    "  model_weights = torchvision.models.ResNet50_Weights.IMAGENET1K_V2\n",
    "  transforms = model_weights.transforms()\n",
    "  model = torchvision.models.resnet50(weights=model_weights)\n",
    "  \n",
    "  # Adjust the number of output features in model to match the number of classes in the dataset\n",
    "  model.fc = torch.nn.Linear(in_features=2048, \n",
    "                             out_features=num_classes)\n",
    "  return model, transforms\n",
    "\n",
    "model, transforms = create_model()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.2 Speedups are most noticeable when a large portion of the GPU is being used\n",
    "\n",
    "Since modern GPUs are so fast at performing operations, you will often notice the majority of *relative* speedups when as much data as possible is on the GPU.\n",
    "\n",
    "This can be achieved by:\n",
    "* **Increasing the batch size** - More samples per batch means more samples on the GPU, for example, using a batch size of 256 instead of 32.\n",
    "* **Increasing data size** - For example, using larger image size, 224x224 instead of 32x32. A larger data size means that more tensor operations will be happening on the GPU.\n",
    "* **Increasing model size** - For example, using a larger model such as ResNet101 instead of ResNet50. A larger model means that more tensor operations will be happening on the GPU.\n",
    "* **Decreasing data transfer** - For example, setting up all your tensors to be on GPU memory, this minimizes the amount of data transfer between the CPU and GPU.\n",
    "\n",
    "All of these result in *more* data being on the GPU.\n",
    "\n",
    "<img src=\"https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/extras-speedups-are-biggest-when-more-gpu-is-used.png\" width=950 alt=\"speedups are biggest when more of the GPU is used\"/>\n",
    "\n",
    "You may be thinking, \"but doesn't this mean that the GPU will be slower because it has to do more work?\"\n",
    "\n",
    "This is correct, operations may take longer when using *more* data on the GPU, however, they benefit from [parallelism](https://en.wikipedia.org/wiki/Parallel_computing) (many operations happening at once).\n",
    "\n",
    "This means that although *more* operations are happening, the GPU is performing as many of them as possible simultaneously.\n",
    "\n",
    "So while you may see speedups with smaller datasets, models, batch sizes and data sizes, however, you will tend to see the *biggest relative* speedups with increasing scale.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.3 Checking the memory limits of our GPU\n",
    "\n",
    "To take advantage of speedups at scale, let's check how much memory our GPU has.\n",
    "\n",
    "If your GPU has less memory, you may need to decrease the batch size or image size (less potential for speedups).\n",
    "\n",
    "We can check the memory available on our GPU using [`torch.cuda.mem_get_info()`](https://pytorch.org/docs/stable/generated/torch.cuda.mem_get_info.html#torch.cuda.mem_get_info).\n",
    "\n",
    "This will return a tuple of `(total_free_gpu_memory, total_gpu_memory)`.\n",
    "\n",
    "Where:\n",
    "* `total_free_gpu_memory` is the amount of memory currently *not being used* on the GPU in bytes.\n",
    "* `total_gpu_memory` is the total amount of memory available on the GPU in bytes. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Total free GPU memory: 24.187 GB\n",
      "Total GPU memory: 25.386 GB\n"
     ]
    }
   ],
   "source": [
    "# Check available GPU memory and total GPU memory \n",
    "total_free_gpu_memory, total_gpu_memory = torch.cuda.mem_get_info()\n",
    "print(f\"Total free GPU memory: {round(total_free_gpu_memory * 1e-9, 3)} GB\")\n",
    "print(f\"Total GPU memory: {round(total_gpu_memory * 1e-9, 3)} GB\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wonderful!\n",
    "\n",
    "The takeaways here are:\n",
    "1. The higher the memory available on your GPU, **the bigger your batch size can be, the bigger your model can be, the bigger your data samples can be**. \n",
    "2. For speedups, you should always be trying to use **as much of the GPU(s) as possible**.\n",
    "\n",
    "Let's write some code to use a larger batch size if more GPU memory is available.\n",
    "\n",
    "> **Note:** The ideal batch size you use will depend on the specific GPU and dataset and model you're working with. The code below is specifically targeted for the A100 GPU available on Google Colab Pro. However, you may adjust it for your own GPU. As if you set the batch size too high, you may run into CUDA out of memory errors.\n",
    "\n",
    "If the total memory on the GPU available is **above 16GB**, let's use a batch size of 128 and an image size of 224 (both of these values can be increased on GPUs with more memory).\n",
    "\n",
    "If the total memory on the GPU available is **below 16GB**, let's use a batch size of 32 and an image size of 64 (both of these values can be altered on GPUs with less memory)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "GPU memory available is 24.187 GB, using batch size of 128 and image size 224\n"
     ]
    }
   ],
   "source": [
    "# Set batch size depending on amount of GPU memory\n",
    "total_free_gpu_memory_gb = round(total_free_gpu_memory * 1e-9, 3)\n",
    "if total_free_gpu_memory_gb >= 16:\n",
    "  BATCH_SIZE = 128 # Note: you could experiment with higher values here if you like.\n",
    "  IMAGE_SIZE = 224\n",
    "  print(f\"GPU memory available is {total_free_gpu_memory_gb} GB, using batch size of {BATCH_SIZE} and image size {IMAGE_SIZE}\")\n",
    "else:\n",
    "  BATCH_SIZE = 32\n",
    "  IMAGE_SIZE = 128\n",
    "  print(f\"GPU memory available is {total_free_gpu_memory_gb} GB, using batch size of {BATCH_SIZE} and image size {IMAGE_SIZE}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's adjust the `transforms` to use the respective `IMAGE_SIZE`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Updated data transforms:\n",
      "ImageClassification(\n",
      "    crop_size=224\n",
      "    resize_size=224\n",
      "    mean=[0.485, 0.456, 0.406]\n",
      "    std=[0.229, 0.224, 0.225]\n",
      "    interpolation=InterpolationMode.BILINEAR\n",
      ")\n"
     ]
    }
   ],
   "source": [
    "transforms.crop_size = IMAGE_SIZE\n",
    "transforms.resize_size = IMAGE_SIZE \n",
    "print(f\"Updated data transforms:\\n{transforms}\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.4  More potential speedups with TF32\n",
    "\n",
    "TF32 stands for TensorFloat-32, a data format which is a combination of 16-bit and 32-bit floating point numbers.\n",
    "\n",
    "You can read more about how it works on [NVIDIA's blog](https://blogs.nvidia.com/blog/2020/05/14/tensorfloat-32-precision-format/).\n",
    "\n",
    "The main thing you should know is that it allows you to **perform faster matrix multiplications** on GPUs with the Ampere architecture and above (a compute capability score of 8.0+).\n",
    "\n",
    "Although it's not specific to PyTorch 2.0, since we're talking about newer GPUs, it's worth mentioning.\n",
    "\n",
    "If you're using a GPU with a compute capability score of 8.0 or above, you can enable TF32 by setting [`torch.backends.cuda.matmul.allow_tf32 = True`](https://pytorch.org/docs/stable/notes/cuda.html#tensorfloat-32-tf32-on-ampere-devices) (this defaults to `False`).\n",
    "\n",
    "Let's write a check that sets it automatically for us based on our GPUs compute capability score.\n",
    "\n",
    "> **Note:** TensorFloat32 is disabled by default (set to `False`) in PyTorch versions 1.12 onwards. This is because it [may cause inconsistent results across different devices](https://dev-discuss.pytorch.org/t/pytorch-and-tensorfloat32/504). Although this issue is not noticed for all use cases, it's worth knowing. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[INFO] Using GPU with score: (7, 5), TensorFloat32 (TF32) not available, to use it you need a GPU with score >= (8, 0)\n"
     ]
    }
   ],
   "source": [
    "if GPU_SCORE >= (8, 0):\n",
    "  print(f\"[INFO] Using GPU with score: {GPU_SCORE}, enabling TensorFloat32 (TF32) computing (faster on new GPUs)\")\n",
    "  torch.backends.cuda.matmul.allow_tf32 = True\n",
    "else:\n",
    "  print(f\"[INFO] Using GPU with score: {GPU_SCORE}, TensorFloat32 (TF32) not available, to use it you need a GPU with score >= (8, 0)\")\n",
    "  torch.backends.cuda.matmul.allow_tf32 = False"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.5 Preparing datasets\n",
    "\n",
    "Computing setup done!\n",
    "\n",
    "Let's now create our datasets.\n",
    "\n",
    "To keep things simple, we'll use [CIFAR10](https://pytorch.org/vision/main/generated/torchvision.datasets.CIFAR10.html) since it's readily available in `torchvision`.\n",
    "\n",
    "Some info about CIFAR10 the [CIFAR10 website](https://www.cs.toronto.edu/~kriz/cifar.html):\n",
    "\n",
    "* CIFAR10 is a dataset of 60,000 32x32 color images in 10 classes, with 6,000 images per class.\n",
    "* There are 50,000 training images and 10,000 test images.\n",
    "* The dataset contains 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck.\n",
    "\n",
    "Although the original dataset consists of 32x32 images, we'll use the `transforms` we created earlier to resize them to 224x224 (larger images provide more information and will take up more memory on the GPU)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Files already downloaded and verified\n",
      "Files already downloaded and verified\n",
      "[INFO] Train dataset length: 50000\n",
      "[INFO] Test dataset length: 10000\n"
     ]
    }
   ],
   "source": [
    "# Create train and test datasets\n",
    "train_dataset = torchvision.datasets.CIFAR10(root='.', \n",
    "                                             train=True, \n",
    "                                             download=True, \n",
    "                                             transform=transforms)\n",
    "\n",
    "test_dataset = torchvision.datasets.CIFAR10(root='.', \n",
    "                                            train=False, # want the test split\n",
    "                                            download=True, \n",
    "                                            transform=transforms)\n",
    "\n",
    "# Get the lengths of the datasets\n",
    "train_len = len(train_dataset)\n",
    "test_len = len(test_dataset)\n",
    "\n",
    "print(f\"[INFO] Train dataset length: {train_len}\")\n",
    "print(f\"[INFO] Test dataset length: {test_len}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.6 Create DataLoaders\n",
    "\n",
    "Generally GPUs aren't the bottleneck of machine learning code.\n",
    "\n",
    "Data loading is the main bottleneck.\n",
    "\n",
    "As in, the transfer speed from CPU to GPU.\n",
    "\n",
    "As we've discussed before you want to get your data to the GPU as fast as possible.\n",
    "\n",
    "Let's create our `DataLoaders` using `torch.utils.data.DataLoader`.\n",
    "\n",
    "We'll set their `batch_size` to the `BATCH_SIZE` we created earlier.\n",
    "\n",
    "And the `num_workers` parameter to be the number of CPU cores we have available with `os.cpu_count()`.\n",
    "\n",
    "> **Note:** You may want to experiment with different values for `num_workers` to see what works best for your specific GPU and CPU setup. In my experience, more is better but some people have found this [generally caps out](https://discuss.pytorch.org/t/guidelines-for-assigning-num-workers-to-dataloader/813/3) at `4 * number_of_gpus_you_have`, for example, `num_workers = 4 * 1` for 1 GPU."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Train dataloader length: 391 batches of size 128\n",
      "Test dataloader length: 79 batches of size 128\n",
      "Using number of workers: 16 (generally more workers means faster dataloading from CPU to GPU)\n"
     ]
    }
   ],
   "source": [
    "from torch.utils.data import DataLoader\n",
    "\n",
    "# Create DataLoaders\n",
    "import os\n",
    "NUM_WORKERS = os.cpu_count() # <- use all available CPU cores (this number can be tweaked through experimentation but generally more workers means faster dataloading from CPU to GPU)\n",
    "\n",
    "train_dataloader = DataLoader(dataset=train_dataset,\n",
    "                              batch_size=BATCH_SIZE,\n",
    "                              shuffle=True,\n",
    "                              num_workers=NUM_WORKERS)\n",
    "\n",
    "test_dataloader = DataLoader(dataset=test_dataset,\n",
    "                              batch_size=BATCH_SIZE,\n",
    "                              shuffle=False,\n",
    "                              num_workers=NUM_WORKERS)\n",
    "\n",
    "# Print details\n",
    "print(f\"Train dataloader length: {len(train_dataloader)} batches of size {BATCH_SIZE}\")\n",
    "print(f\"Test dataloader length: {len(test_dataloader)} batches of size {BATCH_SIZE}\")\n",
    "print(f\"Using number of workers: {NUM_WORKERS} (generally more workers means faster dataloading from CPU to GPU)\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.7 Create training and testing loops\n",
    "\n",
    "Dataloaders ready!\n",
    "\n",
    "Let's now create some training and testing loops.\n",
    "\n",
    "These will be the same training and testing loops we created in [05. PyTorch Going Modular](https://www.learnpytorch.io/05_pytorch_going_modular/) with some slight modifications.\n",
    "\n",
    "Since we're focused on measuring speed, we're going to add a timing component to each loop to measure how long each takes to complete.\n",
    "\n",
    "We'll do this by measuring the start and end time of each training and testing epoch with Python's [`time.time()`](https://docs.python.org/3/library/time.html#time.time) and tracking it in a dictionary.\n",
    "\n",
    "> **Note:** One thing I found when experimenting with PyTorch 2.0 is that [`torch.inference_mode()`](https://pytorch.org/docs/stable/generated/torch.inference_mode.html) produced errors in the testing loop. So I've changed it to be [`torch.no_grad()`](https://pytorch.org/docs/stable/generated/torch.no_grad.html) which offers similar functionality but is an older method than `torch.inference_mode()`. If you find that `torch.inference_mode()` works for you, please [let me know on GitHub](https://github.com/mrdbourke/pytorch-deep-learning/discussions) and I'll update this notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "import time\n",
    "from tqdm.auto import tqdm\n",
    "from typing import Dict, List, Tuple\n",
    "\n",
    "def train_step(epoch: int,\n",
    "               model: torch.nn.Module, \n",
    "               dataloader: torch.utils.data.DataLoader, \n",
    "               loss_fn: torch.nn.Module, \n",
    "               optimizer: torch.optim.Optimizer,\n",
    "               device: torch.device,\n",
    "               disable_progress_bar: bool = False) -> Tuple[float, float]:\n",
    "  \"\"\"Trains a PyTorch model for a single epoch.\n",
    "\n",
    "  Turns a target PyTorch model to training mode and then\n",
    "  runs through all of the required training steps (forward\n",
    "  pass, loss calculation, optimizer step).\n",
    "\n",
    "  Args:\n",
    "    model: A PyTorch model to be trained.\n",
    "    dataloader: A DataLoader instance for the model to be trained on.\n",
    "    loss_fn: A PyTorch loss function to minimize.\n",
    "    optimizer: A PyTorch optimizer to help minimize the loss function.\n",
    "    device: A target device to compute on (e.g. \"cuda\" or \"cpu\").\n",
    "\n",
    "  Returns:\n",
    "    A tuple of training loss and training accuracy metrics.\n",
    "    In the form (train_loss, train_accuracy). For example:\n",
    "\n",
    "    (0.1112, 0.8743)\n",
    "  \"\"\"\n",
    "  # Put model in train mode\n",
    "  model.train()\n",
    "\n",
    "  # Setup train loss and train accuracy values\n",
    "  train_loss, train_acc = 0, 0\n",
    "\n",
    "  # Loop through data loader data batches\n",
    "  progress_bar = tqdm(\n",
    "        enumerate(dataloader), \n",
    "        desc=f\"Training Epoch {epoch}\", \n",
    "        total=len(dataloader),\n",
    "        disable=disable_progress_bar\n",
    "    )\n",
    "\n",
    "  for batch, (X, y) in progress_bar:\n",
    "      # Send data to target device\n",
    "      X, y = X.to(device), y.to(device)\n",
    "\n",
    "      # 1. Forward pass\n",
    "      y_pred = model(X)\n",
    "\n",
    "      # 2. Calculate  and accumulate loss\n",
    "      loss = loss_fn(y_pred, y)\n",
    "      train_loss += loss.item() \n",
    "\n",
    "      # 3. Optimizer zero grad\n",
    "      optimizer.zero_grad()\n",
    "\n",
    "      # 4. Loss backward\n",
    "      loss.backward()\n",
    "\n",
    "      # 5. Optimizer step\n",
    "      optimizer.step()\n",
    "\n",
    "      # Calculate and accumulate accuracy metrics across all batches\n",
    "      y_pred_class = torch.argmax(torch.softmax(y_pred, dim=1), dim=1)\n",
    "      train_acc += (y_pred_class == y).sum().item()/len(y_pred)\n",
    "\n",
    "      # Update progress bar\n",
    "      progress_bar.set_postfix(\n",
    "            {\n",
    "                \"train_loss\": train_loss / (batch + 1),\n",
    "                \"train_acc\": train_acc / (batch + 1),\n",
    "            }\n",
    "        )\n",
    "\n",
    "\n",
    "  # Adjust metrics to get average loss and accuracy per batch \n",
    "  train_loss = train_loss / len(dataloader)\n",
    "  train_acc = train_acc / len(dataloader)\n",
    "  return train_loss, train_acc\n",
    "\n",
    "def test_step(epoch: int,\n",
    "              model: torch.nn.Module, \n",
    "              dataloader: torch.utils.data.DataLoader, \n",
    "              loss_fn: torch.nn.Module,\n",
    "              device: torch.device,\n",
    "              disable_progress_bar: bool = False) -> Tuple[float, float]:\n",
    "  \"\"\"Tests a PyTorch model for a single epoch.\n",
    "\n",
    "  Turns a target PyTorch model to \"eval\" mode and then performs\n",
    "  a forward pass on a testing dataset.\n",
    "\n",
    "  Args:\n",
    "    model: A PyTorch model to be tested.\n",
    "    dataloader: A DataLoader instance for the model to be tested on.\n",
    "    loss_fn: A PyTorch loss function to calculate loss on the test data.\n",
    "    device: A target device to compute on (e.g. \"cuda\" or \"cpu\").\n",
    "\n",
    "  Returns:\n",
    "    A tuple of testing loss and testing accuracy metrics.\n",
    "    In the form (test_loss, test_accuracy). For example:\n",
    "\n",
    "    (0.0223, 0.8985)\n",
    "  \"\"\"\n",
    "  # Put model in eval mode\n",
    "  model.eval() \n",
    "\n",
    "  # Setup test loss and test accuracy values\n",
    "  test_loss, test_acc = 0, 0\n",
    "\n",
    "  # Loop through data loader data batches\n",
    "  progress_bar = tqdm(\n",
    "      enumerate(dataloader), \n",
    "      desc=f\"Testing Epoch {epoch}\", \n",
    "      total=len(dataloader),\n",
    "      disable=disable_progress_bar\n",
    "  )\n",
    "\n",
    "  # Turn on inference context manager\n",
    "  with torch.no_grad(): # no_grad() required for PyTorch 2.0, I found some errors with `torch.inference_mode()`, please let me know if this is not the case\n",
    "      # Loop through DataLoader batches\n",
    "      for batch, (X, y) in progress_bar:\n",
    "          # Send data to target device\n",
    "          X, y = X.to(device), y.to(device)\n",
    "\n",
    "          # 1. Forward pass\n",
    "          test_pred_logits = model(X)\n",
    "\n",
    "          # 2. Calculate and accumulate loss\n",
    "          loss = loss_fn(test_pred_logits, y)\n",
    "          test_loss += loss.item()\n",
    "\n",
    "          # Calculate and accumulate accuracy\n",
    "          test_pred_labels = test_pred_logits.argmax(dim=1)\n",
    "          test_acc += ((test_pred_labels == y).sum().item()/len(test_pred_labels))\n",
    "\n",
    "          # Update progress bar\n",
    "          progress_bar.set_postfix(\n",
    "              {\n",
    "                  \"test_loss\": test_loss / (batch + 1),\n",
    "                  \"test_acc\": test_acc / (batch + 1),\n",
    "              }\n",
    "          )\n",
    "\n",
    "  # Adjust metrics to get average loss and accuracy per batch \n",
    "  test_loss = test_loss / len(dataloader)\n",
    "  test_acc = test_acc / len(dataloader)\n",
    "  return test_loss, test_acc\n",
    "\n",
    "def train(model: torch.nn.Module, \n",
    "          train_dataloader: torch.utils.data.DataLoader, \n",
    "          test_dataloader: torch.utils.data.DataLoader, \n",
    "          optimizer: torch.optim.Optimizer,\n",
    "          loss_fn: torch.nn.Module,\n",
    "          epochs: int,\n",
    "          device: torch.device,\n",
    "          disable_progress_bar: bool = False) -> Dict[str, List]:\n",
    "  \"\"\"Trains and tests a PyTorch model.\n",
    "\n",
    "  Passes a target PyTorch models through train_step() and test_step()\n",
    "  functions for a number of epochs, training and testing the model\n",
    "  in the same epoch loop.\n",
    "\n",
    "  Calculates, prints and stores evaluation metrics throughout.\n",
    "\n",
    "  Args:\n",
    "    model: A PyTorch model to be trained and tested.\n",
    "    train_dataloader: A DataLoader instance for the model to be trained on.\n",
    "    test_dataloader: A DataLoader instance for the model to be tested on.\n",
    "    optimizer: A PyTorch optimizer to help minimize the loss function.\n",
    "    loss_fn: A PyTorch loss function to calculate loss on both datasets.\n",
    "    epochs: An integer indicating how many epochs to train for.\n",
    "    device: A target device to compute on (e.g. \"cuda\" or \"cpu\").\n",
    "\n",
    "  Returns:\n",
    "    A dictionary of training and testing loss as well as training and\n",
    "    testing accuracy metrics. Each metric has a value in a list for \n",
    "    each epoch.\n",
    "    In the form: {train_loss: [...],\n",
    "                  train_acc: [...],\n",
    "                  test_loss: [...],\n",
    "                  test_acc: [...]} \n",
    "    For example if training for epochs=2: \n",
    "                 {train_loss: [2.0616, 1.0537],\n",
    "                  train_acc: [0.3945, 0.3945],\n",
    "                  test_loss: [1.2641, 1.5706],\n",
    "                  test_acc: [0.3400, 0.2973]} \n",
    "  \"\"\"\n",
    "  # Create empty results dictionary\n",
    "  results = {\"train_loss\": [],\n",
    "      \"train_acc\": [],\n",
    "      \"test_loss\": [],\n",
    "      \"test_acc\": [],\n",
    "      \"train_epoch_time\": [],\n",
    "      \"test_epoch_time\": []\n",
    "  }\n",
    "\n",
    "  # Loop through training and testing steps for a number of epochs\n",
    "  for epoch in tqdm(range(epochs), disable=disable_progress_bar):\n",
    "\n",
    "      # Perform training step and time it\n",
    "      train_epoch_start_time = time.time()\n",
    "      train_loss, train_acc = train_step(epoch=epoch, \n",
    "                                        model=model,\n",
    "                                        dataloader=train_dataloader,\n",
    "                                        loss_fn=loss_fn,\n",
    "                                        optimizer=optimizer,\n",
    "                                        device=device,\n",
    "                                        disable_progress_bar=disable_progress_bar)\n",
    "      train_epoch_end_time = time.time()\n",
    "      train_epoch_time = train_epoch_end_time - train_epoch_start_time\n",
    "      \n",
    "      # Perform testing step and time it\n",
    "      test_epoch_start_time = time.time()\n",
    "      test_loss, test_acc = test_step(epoch=epoch,\n",
    "                                      model=model,\n",
    "                                      dataloader=test_dataloader,\n",
    "                                      loss_fn=loss_fn,\n",
    "                                      device=device,\n",
    "                                      disable_progress_bar=disable_progress_bar)\n",
    "      test_epoch_end_time = time.time()\n",
    "      test_epoch_time = test_epoch_end_time - test_epoch_start_time\n",
    "\n",
    "      # Print out what's happening\n",
    "      print(\n",
    "          f\"Epoch: {epoch+1} | \"\n",
    "          f\"train_loss: {train_loss:.4f} | \"\n",
    "          f\"train_acc: {train_acc:.4f} | \"\n",
    "          f\"test_loss: {test_loss:.4f} | \"\n",
    "          f\"test_acc: {test_acc:.4f} | \"\n",
    "          f\"train_epoch_time: {train_epoch_time:.4f} | \"\n",
    "          f\"test_epoch_time: {test_epoch_time:.4f}\"\n",
    "      )\n",
    "\n",
    "      # Update results dictionary\n",
    "      results[\"train_loss\"].append(train_loss)\n",
    "      results[\"train_acc\"].append(train_acc)\n",
    "      results[\"test_loss\"].append(test_loss)\n",
    "      results[\"test_acc\"].append(test_acc)\n",
    "      results[\"train_epoch_time\"].append(train_epoch_time)\n",
    "      results[\"test_epoch_time\"].append(test_epoch_time)\n",
    "\n",
    "  # Return the filled results at the end of the epochs\n",
    "  return results"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Time models across single run\n",
    "\n",
    "Training and testing functions ready!\n",
    "\n",
    "Time to start training/evaluating and timing our model.\n",
    "\n",
    "We'll start with the first experiment. "
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.1 Experiment 1 - Single run, no compile\n",
    "\n",
    "For experiment 1, we'll use the following parameters: \n",
    "\n",
    "| **Experiment** | **Model** | **Data** | **Epochs** | **Batch size** | **Image size** | **`torch.compile()`** |  \n",
    "|----- |-----| -----| -----| -----| -----| -----|\n",
    "| 1 (single run) | ResNet50 | CIFAR10 | 5 | 128 | 224 | No |\n",
    "\n",
    "We'll set the number of epochs to `5` and use a learning rate of `0.003` throughout (you can experiment with different learning rates for better results but we're focused on speed)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set the number of epochs as a constant\n",
    "NUM_EPOCHS = 5\n",
    "\n",
    "# Set the learning rate as a constant (this can be changed to get better results but for now we're just focused on time)\n",
    "LEARNING_RATE = 0.003"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> **Note:** Depending on the speed of your GPU, the following code can take a little while to run. For example, it took around 16 minutes on my local NVIDIA TITAN RTX and around 7 minutes on a NVIDIA A100 GPU on Google Colab Pro."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "2d3932b6a6fa4d659b87ba3d26f412e8",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/5 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "2f544154a5b048d98df936ad18774e38",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Training Epoch 0:   0%|          | 0/391 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "a48a76ce7d684b5e9efc8b005fd9c05d",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Testing Epoch 0:   0%|          | 0/79 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch: 1 | train_loss: 0.7734 | train_acc: 0.7333 | test_loss: 0.8021 | test_acc: 0.7477 | train_epoch_time: 184.9701 | test_epoch_time: 12.9893\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "5bc6d82283564dd199bd441333f120f9",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Training Epoch 1:   0%|          | 0/391 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "892d15bbe3944008be8572592b994128",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Testing Epoch 1:   0%|          | 0/79 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch: 2 | train_loss: 0.4337 | train_acc: 0.8501 | test_loss: 0.4794 | test_acc: 0.8338 | train_epoch_time: 185.3404 | test_epoch_time: 12.9515\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "ed2bb3f592094bbc81098e44f3fbc335",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Training Epoch 2:   0%|          | 0/391 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "a74a7d81cdf6474185b4f599d2275c50",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Testing Epoch 2:   0%|          | 0/79 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch: 3 | train_loss: 0.3055 | train_acc: 0.8944 | test_loss: 0.4282 | test_acc: 0.8533 | train_epoch_time: 185.3870 | test_epoch_time: 13.0559\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "fd4ab8292b1d491a8024b100f0e9235d",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Training Epoch 3:   0%|          | 0/391 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "4f6dd56062f746f985630fe2eae64435",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Testing Epoch 3:   0%|          | 0/79 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch: 4 | train_loss: 0.2268 | train_acc: 0.9198 | test_loss: 0.4387 | test_acc: 0.8580 | train_epoch_time: 185.5914 | test_epoch_time: 13.0495\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "801b1f9317664fa2a1128b69b0180cb4",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Training Epoch 4:   0%|          | 0/391 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "ed406b4b41e54292b23cfef46bac7fdb",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Testing Epoch 4:   0%|          | 0/79 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch: 5 | train_loss: 0.1723 | train_acc: 0.9395 | test_loss: 0.3901 | test_acc: 0.8754 | train_epoch_time: 185.5304 | test_epoch_time: 13.0517\n"
     ]
    }
   ],
   "source": [
    "# Create model\n",
    "model, transforms = create_model()\n",
    "model.to(device)\n",
    "\n",
    "# Create loss function and optimizer\n",
    "loss_fn = torch.nn.CrossEntropyLoss()\n",
    "optimizer = torch.optim.Adam(model.parameters(),\n",
    "                             lr=LEARNING_RATE)\n",
    "\n",
    "# Train model and track results\n",
    "single_run_no_compile_results = train(model=model,\n",
    "                                      train_dataloader=train_dataloader,\n",
    "                                      test_dataloader=test_dataloader,\n",
    "                                      loss_fn=loss_fn,\n",
    "                                      optimizer=optimizer,\n",
    "                                      epochs=NUM_EPOCHS,\n",
    "                                      device=device)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.2 Experiment 2 - Single run, with compile\n",
    "\n",
    "Now we'll run the same experiment, but this time with `torch.compile()`.\n",
    "\n",
    "| **Experiment** | **Model** | **Data** | **Epochs** | **Batch size** | **Image size** | **`torch.compile()`** |  \n",
    "|----- |-----| -----| -----| -----| -----| -----|\n",
    "| 2 (single run) | ResNet50 | CIFAR10 | 5 | 128 | 224 | Yes |\n",
    "\n",
    "> **Note:** Depending on the speed of your GPU, the following code can take a little while to run. For example, it took around 16 minutes on my local NVIDIA TITAN RTX and around 7 minutes on an NVIDIA A100 GPU on Google Colab Pro."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Time to compile: 0.00491642951965332 | Note: The first time you compile your model, the first few epochs will be slower than subsequent runs.\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "3dffe08523224e548af0d418592dd488",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/5 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "56e51390cff74ffaaaae2c81fa44c734",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Training Epoch 0:   0%|          | 0/391 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "ccd33c0eb0c749ba84e213c439cae923",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Testing Epoch 0:   0%|          | 0/79 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch: 1 | train_loss: 0.7585 | train_acc: 0.7364 | test_loss: 0.5852 | test_acc: 0.8004 | train_epoch_time: 196.4621 | test_epoch_time: 21.0730\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "e771faaff76e4aa3a02c50f9a44e4e49",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Training Epoch 1:   0%|          | 0/391 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "5a449904b51f4196b15c406e94dd4afb",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Testing Epoch 1:   0%|          | 0/79 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch: 2 | train_loss: 0.4288 | train_acc: 0.8521 | test_loss: 0.5468 | test_acc: 0.8108 | train_epoch_time: 169.9891 | test_epoch_time: 11.0555\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "107e83a661944b5399375e0b9865fffe",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Training Epoch 2:   0%|          | 0/391 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "a951984d0d8a40f99f65d46ef2991fa9",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Testing Epoch 2:   0%|          | 0/79 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch: 3 | train_loss: 0.3080 | train_acc: 0.8928 | test_loss: 0.4791 | test_acc: 0.8377 | train_epoch_time: 170.4004 | test_epoch_time: 10.9841\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "c8b835b479934323b714e6c0e9619f44",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Training Epoch 3:   0%|          | 0/391 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "0e75958332e24018af652c6d52d2f98a",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Testing Epoch 3:   0%|          | 0/79 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch: 4 | train_loss: 0.2322 | train_acc: 0.9184 | test_loss: 0.5551 | test_acc: 0.8306 | train_epoch_time: 170.1974 | test_epoch_time: 11.0482\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "e41cd029f0184a0c8f635a824f15922f",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Training Epoch 4:   0%|          | 0/391 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "b9aeca32a54e49e997424afa99626e77",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Testing Epoch 4:   0%|          | 0/79 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch: 5 | train_loss: 0.1766 | train_acc: 0.9376 | test_loss: 0.3410 | test_acc: 0.8874 | train_epoch_time: 170.0547 | test_epoch_time: 10.9239\n"
     ]
    }
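   ],
   "source": [
    "# Create model and transforms\n",
    "model, transforms = create_model()\n",
    "model.to(device)\n",
    "\n",
    "# Create loss function and optimizer\n",
    "loss_fn = torch.nn.CrossEntropyLoss()\n",
    "optimizer = torch.optim.Adam(model.parameters(),\n",
    "                             lr=LEARNING_RATE)\n",
    "\n",
    "# Compile the model and time how long the torch.compile() call takes\n",
    "# Note: torch.compile() is lazy, the call only wraps the model and the actual\n",
    "# compilation happens the first time the compiled model is called (e.g. in epoch 1)\n",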
    "compile_start_time = time.time()\n",
    "\n",
    "### New in PyTorch 2.x ###\n",
    "compiled_model = torch.compile(model)\n",
    "##########################\n",
    "\n",
    "compile_end_time = time.time()\n",
    "compile_time = compile_end_time - compile_start_time\n",
    "print(f\"Time to compile: {compile_time} | Note: The first time you compile your model, the first few epochs will be slower than subsequent runs.\")\n",
    "\n",
    "# Train the compiled model\n",
    "single_run_compile_results = train(model=compiled_model,\n",
    "                                   train_dataloader=train_dataloader,\n",
    "                                   test_dataloader=test_dataloader,\n",
    "                                   loss_fn=loss_fn,\n",
    "                                   optimizer=optimizer,\n",
    "                                   epochs=NUM_EPOCHS,\n",
    "                                   device=device)"
   ]
  },
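  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> **Note:** The compile time printed above is tiny (about 0.005 seconds) because `torch.compile()` is lazy. The call only wraps the model; the actual compilation happens the first time the compiled model is called. This is consistent with the first epoch above being slower than the epochs after it.\n",
    "\n",
    "A minimal sketch of this behaviour is below. The small `torch.nn.Linear` model and the names `toy_model`, `compiled_toy_model` and `x` are assumptions for illustration only (they are not part of the experiments in this notebook), and the exact timings will depend on your hardware and PyTorch version.\n",
    "\n",
    "```python\n",
    "import time\n",
    "import torch\n",
    "\n",
    "# Hypothetical toy model and input, used only to illustrate lazy compilation\n",
    "toy_model = torch.nn.Linear(in_features=8, out_features=2)\n",
    "x = torch.randn(4, 8)\n",
    "\n",
    "# 1. Wrapping the model returns almost immediately (nothing is compiled yet)\n",
    "start_time = time.time()\n",
    "compiled_toy_model = torch.compile(toy_model)\n",
    "print(f\"Time to wrap model: {time.time() - start_time:.4f} seconds\")\n",
    "\n",
    "# 2. The first call triggers the actual compilation, so it is the slow one\n",
    "start_time = time.time()\n",
    "compiled_toy_model(x)\n",
    "print(f\"First call: {time.time() - start_time:.4f} seconds\")\n",
    "\n",
    "# 3. Later calls with the same input shape reuse the compiled code\n",
    "start_time = time.time()\n",
    "compiled_toy_model(x)\n",
    "print(f\"Second call: {time.time() - start_time:.4f} seconds\")\n",
    "```\n",
    "\n",
    "If you want a number closer to the real compilation cost, time the first forward pass (or compare the first epoch to later epochs) rather than the `torch.compile()` call itself."
   ]
  },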
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.3 Compare the results of experiment 1 and 2\n",
    "\n",
    "Nice!\n",
    "\n",
    "We've got two trained models:\n",
    "\n",
    "1. One without `torch.compile()`.\n",
    "2. One with `torch.compile()`.\n",
    "\n",
    "Let's compare the results of each experiment.\n",
    "\n",
    "To do so, we'll first create dataframes of the results of each.\n",
    "\n",
    "Then we'll plot the results of each experiment on a bar chart."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Turn experiment results into dataframes\n",
    "import pandas as pd\n",
    "single_run_no_compile_results_df = pd.DataFrame(single_run_no_compile_results)\n",
    "single_run_compile_results_df = pd.DataFrame(single_run_compile_results)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>train_loss</th>\n",
       "      <th>train_acc</th>\n",
       "      <th>test_loss</th>\n",
       "      <th>test_acc</th>\n",
       "      <th>train_epoch_time</th>\n",
       "      <th>test_epoch_time</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.773435</td>\n",
       "      <td>0.733272</td>\n",
       "      <td>0.802100</td>\n",
       "      <td>0.747725</td>\n",
       "      <td>184.970135</td>\n",
       "      <td>12.989331</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.433699</td>\n",
       "      <td>0.850052</td>\n",
       "      <td>0.479412</td>\n",
       "      <td>0.833762</td>\n",
       "      <td>185.340373</td>\n",
       "      <td>12.951483</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.305494</td>\n",
       "      <td>0.894429</td>\n",
       "      <td>0.428212</td>\n",
       "      <td>0.853343</td>\n",
       "      <td>185.386973</td>\n",
       "      <td>13.055891</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.226751</td>\n",
       "      <td>0.919829</td>\n",
       "      <td>0.438668</td>\n",
       "      <td>0.857991</td>\n",
       "      <td>185.591368</td>\n",
       "      <td>13.049541</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.172269</td>\n",
       "      <td>0.939482</td>\n",
       "      <td>0.390148</td>\n",
       "      <td>0.875396</td>\n",
       "      <td>185.530370</td>\n",
       "      <td>13.051713</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   train_loss  train_acc  test_loss  test_acc  train_epoch_time  \\\n",
       "0    0.773435   0.733272   0.802100  0.747725        184.970135   \n",
       "1    0.433699   0.850052   0.479412  0.833762        185.340373   \n",
       "2    0.305494   0.894429   0.428212  0.853343        185.386973   \n",
       "3    0.226751   0.919829   0.438668  0.857991        185.591368   \n",
       "4    0.172269   0.939482   0.390148  0.875396        185.530370   \n",
       "\n",
       "   test_epoch_time  \n",
       "0        12.989331  \n",
       "1        12.951483  \n",
       "2        13.055891  \n",
       "3        13.049541  \n",
       "4        13.051713  "
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Check out the head of one of the results dataframes\n",
    "single_run_no_compile_results_df.head()"
   ]
  },
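  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before plotting, the speed difference can be sanity-checked by hand.\n",
    "\n",
    "The sketch below uses plain Python lists with the train epoch times copied from the training logs of experiments 1 and 2 above. The helper name `mean_time_diff_percent` is an assumption for illustration and is not used elsewhere in this notebook.\n",
    "\n",
    "```python\n",
    "# Train epoch times in seconds, copied from the training logs above\n",
    "no_compile_train_times = [184.970135, 185.340373, 185.386973, 185.591368, 185.530370]\n",
    "compile_train_times = [196.4621, 169.9891, 170.4004, 170.1974, 170.0547]\n",
    "\n",
    "def mean_time_diff_percent(baseline_times, new_times):\n",
    "    # Percentage difference of mean(new_times) relative to mean(baseline_times)\n",
    "    baseline_mean = sum(baseline_times) / len(baseline_times)\n",
    "    new_mean = sum(new_times) / len(new_times)\n",
    "    return (new_mean - baseline_mean) / baseline_mean * 100\n",
    "\n",
    "# All 5 epochs (includes the slower first epoch of the compiled run)\n",
    "all_epochs_diff = mean_time_diff_percent(no_compile_train_times, compile_train_times)\n",
    "print(f\"All epochs: {round(all_epochs_diff, 3)}% (negative means faster)\")\n",
    "\n",
    "# Skip the first epoch of each run to leave out the one-off warm-up cost\n",
    "later_epochs_diff = mean_time_diff_percent(no_compile_train_times[1:], compile_train_times[1:])\n",
    "print(f\"Epochs 2-5 only: {round(later_epochs_diff, 3)}% (negative means faster)\")\n",
    "```\n",
    "\n",
    "The first value works out to -5.364%, the same mean train epoch time difference reported when the results are plotted later in this section. Leaving out the first epoch gives a noticeably larger speedup, which suggests the benefit of `torch.compile()` grows with longer training runs."
   ]
  },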
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Got the results for experiments 1 and 2!\n",
    "\n",
    "Now let's write a function to take in the results and compare them with a bar chart.\n",
    "\n",
    "We'll add some metadata to the function so it can display some information about the experiments.\n",
    "\n",
    "Namely all of the parameters in our experiment setup:\n",
    "* The dataset name.\n",
    "* The model name.\n",
    "* The number of epochs.\n",
    "* The batch size.\n",
    "* The image size."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set the dataset and model names (used in the plot titles and save paths)\n",
    "DATASET_NAME = \"CIFAR10\"\n",
    "MODEL_NAME = \"ResNet50\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "\n",
    "def plot_mean_epoch_times(non_compiled_results: pd.DataFrame,\n",
    "                          compiled_results: pd.DataFrame,\n",
    "                          multi_runs: bool=False,\n",
    "                          num_runs: int=0,\n",
    "                          save: bool=False,\n",
    "                          save_path: str=\"\",\n",
    "                          dataset_name: str=DATASET_NAME,\n",
    "                          model_name: str=MODEL_NAME,\n",
    "                          num_epochs: int=NUM_EPOCHS,\n",
    "                          image_size: int=IMAGE_SIZE,\n",
    "                          batch_size: int=BATCH_SIZE) -> None:\n",
    "    # Note: the plot title uses the global gpu_name variable, so it must be defined before calling this function\n",
    "\n",
    "    # Get the mean epoch times from the non-compiled models\n",
    "    mean_train_epoch_time = non_compiled_results.train_epoch_time.mean()\n",
    "    mean_test_epoch_time = non_compiled_results.test_epoch_time.mean()\n",
    "    mean_results = [mean_train_epoch_time, mean_test_epoch_time]\n",
    "\n",
    "    # Get the mean epoch times from the compiled models\n",
    "    mean_compile_train_epoch_time = compiled_results.train_epoch_time.mean()\n",
    "    mean_compile_test_epoch_time = compiled_results.test_epoch_time.mean()\n",
    "    mean_compile_results = [mean_compile_train_epoch_time, mean_compile_test_epoch_time]\n",
    "\n",
    "    # Calculate the percentage difference between the mean compile and non-compile train epoch times\n",
    "    train_epoch_time_diff = mean_compile_train_epoch_time - mean_train_epoch_time\n",
    "    train_epoch_time_diff_percent = (train_epoch_time_diff / mean_train_epoch_time) * 100\n",
    "\n",
    "    # Calculate the percentage difference between the mean compile and non-compile test epoch times\n",
    "    test_epoch_time_diff = mean_compile_test_epoch_time - mean_test_epoch_time\n",
    "    test_epoch_time_diff_percent = (test_epoch_time_diff / mean_test_epoch_time) * 100\n",
    "\n",
    "    # Print the mean difference percentages\n",
    "    print(f\"Mean train epoch time difference: {round(train_epoch_time_diff_percent, 3)}% (negative means faster)\")\n",
    "    print(f\"Mean test epoch time difference: {round(test_epoch_time_diff_percent, 3)}% (negative means faster)\")\n",
    "\n",
    "    # Create a bar plot of the mean train and test epoch time for both compiled and non-compiled models\n",
    "    plt.figure(figsize=(10, 7))\n",
    "    width = 0.3\n",
    "    x_indices = np.arange(len(mean_results))\n",
    "\n",
    "    plt.bar(x=x_indices, height=mean_results, width=width, label=\"non_compiled_results\")\n",
    "    plt.bar(x=x_indices + width, height=mean_compile_results, width=width, label=\"compiled_results\")\n",
    "    plt.xticks(x_indices + width / 2, (\"Train Epoch\", \"Test Epoch\"))\n",
    "    plt.ylabel(\"Mean epoch time (seconds, lower is better)\")\n",
    "\n",
    "    # Create the title based on the parameters passed to the function\n",
    "    if multi_runs:\n",
    "        plt.suptitle(\"Multiple run results\")\n",
    "        plt.title(f\"GPU: {gpu_name} | Epochs: {num_epochs} ({num_runs} runs) | Data: {dataset_name} | Model: {model_name} | Image size: {image_size} | Batch size: {batch_size}\")\n",
    "    else:\n",
    "        plt.suptitle(\"Single run results\")\n",
    "        plt.title(f\"GPU: {gpu_name} | Epochs: {num_epochs} | Data: {dataset_name} | Model: {model_name} | Image size: {image_size} | Batch size: {batch_size}\")\n",
    "    plt.legend()\n",
    "\n",
    "    # Save the figure\n",
    "    if save:\n",
    "        assert save_path != \"\", \"Please specify a save path to save the model figure to via the save_path parameter.\"\n",
    "        plt.savefig(save_path)\n",
    "        print(f\"[INFO] Plot saved to {save_path}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot function ready!\n",
    "\n",
    "Let's create a directory to store our figures in and then plot the results of our first two experiments."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[INFO] Save path for single run results: pytorch_2_results/figures/single_run_NVIDIA_TITAN_RTX_ResNet50_CIFAR10_224_train_epoch_time.png\n",
      "Mean train epoch time difference: -5.364% (negative means faster)\n",
      "Mean test epoch time difference: -0.02% (negative means faster)\n",
      "[INFO] Plot saved to pytorch_2_results/figures/single_run_NVIDIA_TITAN_RTX_ResNet50_CIFAR10_224_train_epoch_time.png\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAokAAAHOCAYAAAD0YpNoAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAABJT0lEQVR4nO3dd5hcZdn48e9NEgi9RqUnIKFJCBB4QVqQJkUEBAFDCUh7BSk2ig18QREQBUUi/KQpVUAFQYx0VBATCCX0EiD0FggtknD//jhnw2Rny9nsTmZJvp/r2mtnTnmee2bOnLnnKWciM5EkSZJqzdXsACRJktT7mCRKkiSpjkmiJEmS6pgkSpIkqY5JoiRJkuqYJEqSJKmOSaKkbomIERExuofKuiUi9u+Jsj6uImJCRGzR7DgkySRRUqciYqOI+FdEvBkRr0fEPyNiXYDMvCgzt2p2jLOjiDguIn7f7DgkzZn6NjsASb1bRCwE/AX4X+ByYG5gY2BKM+PqSET0zcypH9fyJak3sCVRUmcGA2TmJZk5LTPfy8zRmXkfQESMjIh/tGwcERkRB0fEYxHxRkScGRFRrusTET+LiFcj4qmIOLTcvs0vrBGxX0Q8VJbzt4hYvp3tBpblfDUingFuiojhETGx1XbTu3LLVrrLI+LCiJgcEeMjYlh7T0JZ/iER8RjwWLls+4gYFxGTypbWITXbHxURz5VlPxIRm5fLz4+IE2q2q4uzXP554Fhgt4h4OyLurXm+nyzLfSoiRrQXsyR1h0mipM48CkyLiAsiYpuIWLTCPtsD6wJrAl8Gti6XHwBsAwwF1gZ2bK+AiNiRIknaGRgA3A5c0km9mwKr1tTXmR2AS4FFgKuBX3Wy/Y7A/wCrRcTawLnAQcDiwG+AqyNinohYGTgUWDczFyzjmVAxJgAy83rgx8BlmblAZq4ZEfMDZwDblOV+FhjXlXIlqSqTREkdysy3gI2ABM4BXomIqyPikx3sdlJmTsrMZ4CbKZJCKBLG0zNzYma+AZzUQRkHAT/JzIfKrt0fA0Pba00sHZeZ72Tme9UeHf/IzOsycxrwO4qktiM/yczXy/IPAH6Tmf8uW1gvoOiCXx+YBsxDkUz2y8wJmflExZg68yHwmYiYNzNfyMzxPVSuJM3AJFFSp8pEbWRmLgN8BlgK+EUHu7xYc/tdYIHy9lLAszXram+3tjxwetmVOwl4HQhg6Q726ai8KnH2b6/ru43ylwe+2RJfGeOywFKZ+ThwBHAc8HJEXBoRS3UxtjqZ+Q6wG3Aw8EJEXBsRq3S3XElqi0mipC7JzIeB8ymSxa56AVim5v6yHWz7LHBQZi5S8zdvZv6ro/Bqbr8DzNdyJyL6UHRbd0dt+c8CJ7aKb77MvAQgMy/OzI0okskEftpWXMCnKtZHWe7fMnNLYEngYYrWXUnqcSaJkjoUEatExDcjYpny/rLAHsCdM1Hc5cDhEbF0RCwCHNXBtqOAYyJi9bLehSNi1y7U9ShFy+B2EdEP+B5FF3BPOQc4OCL+Jwrzl3UtGBErR8TnImIe4H3gPYouaCjGEG4bEYtFxKcoWhzb8xIwMCLmAoiIT0bEDuXYxCnA2zXlSlKPMkmU1JnJFJM1/h0R71Akhw8A35yJss4BRgP3AfcA1wFTaSPRycw/UrS+XRoRb5V1blO1osx8E/ga8P+A5yha8OpmEc+szBxDMS7xV8AbwOPAyHL1PBTjLV+l6NL+BMUkHCjGPt5LMZFlNHBZB9X8ofz/WkTcTXHO/ibwPEX3+6YUj1GSelxk1vVmSNIsERHbAKMys6PJKJKkJrAlUdIsExHzRsS2EdE3IpYGfgj8sdlxSZLq2ZIoaZaJiPmAW4FVKMbpXQscXl5mR5LUi5gkSpIkqY7dzZIkSapjkihJkqQ6JomSJEmqY5IoSZKkOiaJkiRJqmOSKEmSpDomiZIkSapjkihJkqQ6JomSJEmqY5IoSZKkOiaJ
kiRJqmOSKEmSpDomiZIkSapjkihJkqQ6JomSJEmqY5IoSZKkOiaJkiRJqmOSKEmSpDomiZIkSapjkihJkqQ6JomSJEmqY5KoWSYiJkTEwGbH0SIijouI3/eCOG6JiOHNjmNO14zjMyLOj4gTKm47ISK2aHRMH2fl8zmy2XE0QkQsFxFvR0SfZsfSEyJiYERMaHYc0LX3YSfl/DUi9umJmHqLLieJEbF7RPw7It6JiJfL21+LiCjXnx8R/y0P5tcj4u8RsUrNuhNalTcwIjIi+lasPyPi/oiYq2bZCWXZ/SNiUkR8ro39fh4RV5S3p59sI2JkREwr4307Ip6KiPMiYnBnMZZJRkbEeh3EO6qm7P9GxAc19/9aW3Z5v2XdBzXP49sRMaosb/7y/nVt1DUhIl6KiPlrlu0fEbe0E1tL3S11TIiIo8t142uWT4uI92vuHxsRX4+IByJi7pryjoiIe6q+lq1iaf06tPwt1dWymq08Lj5o9ThWmMmyhkfEhzXlTIyIyyNi3S7G06PJcEQsGRG/jYgXImJyRDwcEce3HHvlcfXpmvpbPx/fqSnrloh4IyLmaVVHu+eSmhiujojny/oGttp/nog4NyLeiogXI+Ib3Xi8I8s6Tmu1fMdy+fkzW3YjdPbczWSZLeeLa1st/31EHFexjLpEtyzznZpj4/+1Wn9k+fq9Wb6e8zATooufNb1NZj6TmQtk5rRG1RER65fHyusR8UpE/CEilqxZ/+3yvD85is/Kb7dTzqblcz3TiVerY3hyRIyNiE27sP8s/1KVmdtk5gWNrCMiDo2IMRExpfV5p8LrN08UOclL5TbXRMTSHdXXpSQxIr4JnA6cAnwK+CRwMLAhMHfNpidn5gLAMsDLwPn0rKWA3VsvzMz3gcuAvVvF3QfYA2jvxbujjHdhYAvgPWBsRHymvQAiIoC9gNeBdr85ZObB5Rt7AeDHwGUt9zNzm1bbblOz7UWUz2P5d3C52S7AFGCr2he/Rl/g8PbiacciZZ27AN+PiC0zc/WaWG4HDq2J5cfAmcAk4Lvl87ECcDzw1cyc2sX6W9xRU0fL3/MzWVazXdbqcTzZjbKeL1+HBYH1gYeB2yNi8x6JtIsiYjHgDmBeYIPMXBDYElgEWLGd3Vo/HyeXZQ0ENgYS2KGN/VrOJUsDzwG/rVn3IXA98KV26jwOWAlYHtgM+E5EfL7iw2zLE8BurZKMvYFHu1FmI3X03HXH+hGxYQ+V1WLNmmNj/5aFEbE1cDSwOTAQaDnPqDEWBc6meK6XByYD59WsD4pjflHg88ChETHDZ3FE9KPIE/7dA/G0HMMLA2cBV8Vs0pLaDc8DJwDntrGus9fvcGADYAhFHjUJ+GVHlVVOEiNiYeBHwNcy84rMnJyFezJzRGZOab1PZr4LXAy0m2zNpJOB49v5RngB8KWImK9m2dYUj/WvHRWamdMy84nM/BpwK8WHTHs2pniSDwd2j5oWtQbbBxgF3AeMaGP9KcC3ImKRrhacmWOA8cDQCtt+CHwVODIihgDnAL/OzLu7Wm8V5bfCYyLiwbLV6byI6F+z/oCIeLz8dnR11LRARsTqNd+uXoqIY2uKnjsiLiy/qY6PiGE1+x0VEc+V6x5pVlLWony/TczMHwD/D/hpy7qIOD0ino2i1WxsRGxcLv88cCxFcvN2RNxbLt83Ih4qH9uTEXFQF0L5BsXJZ8/MnFDG9mxmHp6Z93XxYe0N3EnxRbKjL1vvAZdTc2xm5kuZ+WvgPx2U/X+Z+UZmPkRxjI7sYny1XgTupziftCTLnwWurt0oInYoj6VJUbSSrlqzbq2IuLt83i8D+rfad/uIGFfu+6/yvdUtbT13EbFURFxZtjY8FRGH1axbr2ypeKt8v5zWqsiTKT6k2tTeY4iI3wHLAddEq9bkDuwD/DYzx2fmG8D/0b3XsDbO8yPi1/FRD84/I+JTEfGL8hzzcESsVbP90RHxRPnaPRgRO9Ws6xMRP4uIV8vn89Co
abWMiIXjo5b356Lo/Woz2Wnv+Y8Ze502iBlb5t+Psus2IuaqifW1KHoeFqvynGTmXzPzD5n5Vvn5/SuKRqCW9Sdn5t2ZOTUzHwH+XLu+9E1gNMWX2R5Rft5cDCxG0ThFRKwYETeVj/HViLgoys+99o61iNioPCYnlefLkTXVLBoR15av778jos0vvFH0WP6+rHdSRPwnIlpiuiUi9i9v39vqNcoohxVF0eLXEse90YXhRpl5VWb+CXitjXUdvn7AIOBv5bnzfeBSYPWO6utKS+IGwDwUB0UlEbEARSJzTxf2+XVE/LqTza4C3qKNk0Vm/gt4Adi5ZvFewMVdbOG6iiIRbM8+wDUULZcA23eh7JkSEcsBwylaGS+iVYtpaQxwC/CtmSh/fYqE/vEq25cniZ8AN1G0Gjf6G/4Iig/oFYHBwPcAohhe8BPgy8CSwNMUBz8RsSBwA0WL01LAp4Eba8rcodx2EYoP+1+V+60MHAqsW7aUbQ1MKNdtFBGTOon1C1EkpeMj4n+78ZjbcxWwdnw0tOA/FEnAYhQn0z9ERP/MvJ4ZW7DXLLd/meKYXQjYF/h5RKzdUnh58tqonbq3AK4qT9zdtTcfHc9bt5xsWysf5x5UPDYjYlGK1/vemsX30skJsYIL+eh9tzvF+XD6F+QohqlcAhwBDACuo/igmjuKL5J/An5H8Tr9gZpW0PL5Pxc4CFgc+A1wdbTRvVrxGGzZdobnLoqhOtdQPB9LU7TSHRFFqx0UrUCnZ+ZCFO+1y1sVeSYwONroyuvoMWTmXsAzwBdqW5NLt0XRpXxVzDhsYHXqX8NPRsTiVR57BV+mOI8sQfE63gHcXd6/AqhNkJ+g+ExYmOJc9/v4qDfnAGAbivfg2sCOreq5AJhKcf5ZC9gK2J+2dfb8k5nTe10oWo/upDjuAA4r69+U4j3wBsVrBkBE3BcRX2mn7tY2oWg4qBMRQfF8jK9ZtjywH0WDUo8pE+q9gaeAl1oWU5z3lwJWBZalbNhp61grPz//StFyNoDitRpXU80eFK/rohTvlRPbCWcfimNgWYpj/GCK3scZZOaaNa/RN4BHgLuj6N69luKL1mIUn9VXRsSA8rEeHRF/qf7sdKj16/dbYMMoviTOR/GZ2mHjGZlZ6Q/YE3ix1bJ/UTRXvgdsUi47H3i/XP4ixQfvijXrTmhVxkCKrqa+FeNIijfathQHwTwUT/b5Ndt8Dxhd3l4IeBdYq2b9BGCL8vZI4B9t1PN54IO2YgTmo0hSdyzv/wb4c4XYjwN+X+Xxt/NcfQ8YV95eCpjW1uOiSPTepHgj7A/c0k48LXW3vIYJnApEq+1uAfZvp4yNyv1OrPD4JwAD21k3kuIkOqnm74lW+x5cc3/blvUUB/7JNesWAD4oH98ewD0dvB431NxfDXivvP1pikRqC6Bf1fdJTTlLAX0oWppeAPboYPtbgOHtrBsOTGxj+Srl8750O/u9QdGF1+Zx18b2fwIOr/j4Hqt9LdrZJoFP19T/31av7VLlsfMBsES53cPAka3eAy3nkg8pPiCGtFFX37K+gTXLli2X9a9ZtiUwoRvH5z8outhfoviQuJPiW/r08w/wfeDymv3moujqHU5xwn6emvcXxTn0hPL2WRQtn7X1PgJsWhPfFhVfo3afO+B/gGdabX8McF55+zaKD8slWm0zsHxO+wJfA+4sl/8eOG5mH0P5vMxN8UXtV8ADfHSufQL4fM22/Vq/1m087pHtrJsef82259Ss/zrwUM39NYBJHTzH44AvlrdvAg6qWbdFzXP1SYoEdN6a9XsAN7dTbqfPf6vlZ1EkHXOV9x8CNq9ZvyTF+6zSZ2zNfkMohlNt3M764ymS9nlqlv0Z2K3m+T2hg/IH0vH78Xw+OobfL/9GdLD9jtSc61sfaxTH+B87qOv/1dzfFni4nW33o3jftnUuuoVWn5UU57mXgcHl/aOA37Xa5m/APl18fWbIe6q8fhT50CXlcTSVogFvsY7q
6UpL4mvAElHTxZuZn83MRcp1tWWdmpmLZOanMnOHzHyiXD6V4k1eqx/FSaxLrRKZeR1FknhgG6svBDYrM/ZdgMcz856ulE/xDfv1dtbtRPFYWiaPXARs0/JNoIFaWl3IYqzerbTRRZeZDwB/oRjLU8USFInVtyg+zFq/Rm0qW0Z+Q/HN7NCYyckZNe4sj5uWv9bN/c/W3H6aItGg/P90y4rMfJvimFyaIll4gva9WHP7XaB/RPTNzMcpWoOOA16OiEuj4iSazHwwM5/PYvjCvyhaBnapsm8XLM1HCT4R8c0ouo/fLFuYFqZ4XdsUEdtExJ1la+ckipNiu9u38hrFB09XXN7qtX2e4tgdnZmvlttcTP3xfGp5jhlI8UVm5Yr1vV3+X6hm2UIU3eQzLYuu22spW58y85+tNml9LH5IcdwuXa57LsuzdenpmtvLA98sW3Enla/Lsnx0nHdVe8/d8sBSreo5lrIbj2IYyWDg4bIrra1eknMoWvS+0Gp5lx9DZt6Wmf/NzEkUw3cGUbQMQfE6tn4NoZuvY42Xam6/18b9BVruRMTe8VE3+iSKL+Mt75mlmPH8VHt7eYpz6gs1+/4G+EQ7MVV5/ltiOojinP2V/Khlf3ngjzV1PUTRoNBmK3075X6aooXp8My8vY31h1J8Hm2X5VCz8lhYMDMva719N7Qcw/MCw4BTImKbsr5PlOfl5yLiLYovKx2dw7r6WbBAO9v9jiKpuzSKSXMnRzEOs05ELEvRErxPZraMXV4e2LXVe2Qjun5ObVcHr99ZFENcFgfmp+iR6rAlsStJ4h0U34a+2KVoZ/QMxQmr1iDg2Zy5rqvvUUycqB1/SGY+QzHZYgRFV/OFM1H2TmUZbdmH4gB6JiJepOg26kfx7bAhIuKzFIPwjym7ZV6kaBHYI9oem/lDii6QDmcutSgTmp9RfFv7WsWwvk/xDelwinGSv6m438xatub2chStMpT/l29ZUXavLU7RgvMs7U+m6FBmXpyZG5VlJzVjALtaFEXXSE/aCbg7M9+JYvzhURRdZ4uWJ9U3a+qsTUoouy+vpGg1/mS5/XVdiPEGYKeoucJAV0XEvGW8m9Ycz0cCa0bEmq23L9/ThwOnl/t2KIvxay8AtWWtSTtdZ110IcW4q9+1sa71sRgUx+1zZTxLl8taLFdz+1mKFvnaZHq+zLyEbmjjuXsWeKpVPQtm5rbl9o9l5h4UScxPgStqhjW0lPkBRUvS/zHjcdPZY5jhWGwv5Joyx1P/Gr6UmXXjsRqp7EY9h2IIyuLle+aBmjhfoBhy06L2XPUsxWfnEjXPyUKZ2ebQhyrPfxnTxhTP/xcz881W9W3T6jXon5nPdeGx3kDRIlx3jEfEfpSTiTJzYs2qzYFhNe/n3SiGMVQeotaeLDwA/BPYrlz8E4pjZUgWXfN7MuOx2PpYm+nPglaxfJCZx2fmahQ9RdvTxtCv8r32J+AXmVmbiD1L0ZJY+/rMn5kndTe2st6OXr81KVofXy+T+18C60VEu8l15ZN8+S3veODXEbFLRCwQxQDZoRQZaRVXAttFxFZRDPRdiiLRu7RqHK1iuoViIHldaxrFGJBDKbqDLqpSXhnToIj4JcW3s7oxdmXr5OYUB8bQ8m9NijdzW3H0lH2Av1N0ZbbU+xmKBHmb1huXLWGXUYxP6YqTKGaB9u9oo/KD/DDggLJl5DhgYETs28X6uuKQiFgmikHYx/LReNCLgX0jYmiZAP0Y+HcWkyr+AnwqisvzzBMRC0bE/3RWUUSsHBGfK8t7n6JVodKlJyLiixGxaBTWo3ieun2iLMtbOiJ+SDGMoGUCzoIULduvAH0j4gfM2PryEsVr0/J+n5timMYrwNTym/lWXQjltLL8C8oTEmVcp0X1iRY7UjyftcfzqhRfzNoaa0tm/p0iCZvee1Aepy1j9uZpddxeCHyvfC1WofjSdH7F+DpyK0XXdVuzAi+nOMdtXrYufJMiQfgXxRftqcBhUUw+
2BmovXzWOcDBEfE/5Ws9f0RsF8W42m5p9dzdBbwVxcSsecvz3meivKxSROwZEQPKL+6TyiLaOvZ/R/Hc184Y7+wxvEQxQ5myrtXL922fKMaw/4wioX6o3ORC4KsRsVoU40y/R89fLaOK+SmSjlegmPjFjBMyLwcOL98Hi1B8aQMgM1+gmMjxs4hYqPzcXDHauZxLlec/ihaqy4C9a1qoWowCTqx5bw6IiEqNO+Xn203AmZk5qo31IyjOr1tm/RUbvk/RAjq0/Lua4njokc+E8j28ER990VuQoqV5Uhl368vxzHCsUeQBW0TEl8v33+Jl/tLVODaLiDWiGCf5FkVXflvvj3MpuqxPbrX89xRj1rcuj/v+UVzqbJk2ymir/r7lea4P0LJ/ywSpDl8/irHre0cxkaofRYPQ8zW9OXW61BJQPthvAN+haEF6iaL16CiKk2Bn+4+naG37CUVX7h0U0+SnJ2NRXMOnrQfXnu9RDP5s7QqKAag3lm/SjmwQEW9TvOC3UHwArpuZ97ex7V4U4wJHZ+aLLX/AGcCQ6OCyOTOrPCC+DPyyts7MfIriRN1ecvojqifwLa6lGM92QAfx9KEYB3himYy2dMMdQNEdULlbo5XWM/bejhmvB3gxxcn2yfLvhLLuGylOUFdSfKNfkfISSZk5meID/QsU3QmPAZtViGUeioT51XK/T1AmZRGxcXm8tGd3ioHPkyk+5H6a3bt21lJlfW9TvMnXoBjDOLpc/zeKLoNHKbov32fG7q4/lP9fi4i7y+fkMIoPtjeAr1A/Q/ftKGdIt5aZr1N8g/4A+HdETKaYDPQmFSeWUByz52Vx7bfa99GvgBHR/rXsTqH4EtOSGL7HR13LDzPjAPIfUnQvPU2R2J2SxUSebilbNW4sn4fW6x6haNH4JcWx8wWKwfP/zcz/UkyoG0nxvO9G0d3Tsu8YivfQr8r1j9POTN4Kx2BbTqE4d/ct4xpKMVbxVYrZ8guX230eGF+WfzqwexYzIVs/1mkUz/FiNcs6eww/oUjcJ0XEtyi6QC+jOPc+SdHTtH3ZUkn5ep0M3EzxOj5d1jlLZeaDFAnsHRSfe2tQtGq1OIfi3HQfxTiv6yi+ELQkD3tTfDl7kOJ5uYL2uxerPP+bU1yG7oqac2VL8nQ6xft5dPnevJOi1wmAKCbTtXVlDCi+fK4A/LD2PFyz/gSKXpr/1KwfVT5Hk1u9l98D3mnrfdIF3ynreIfi+T2Pj3qsjqeYJPQmxefWVa32neFYK1vUt6X44vY6xZjSul6LCj5F8fq9RfFl5laKxK+13Sl6XGo/zzbOzGcpemSPpfjS8SxFgjsXQBTXIe6oC/h7FM/t0RTnmvfKZdD56/ctis+Hx8q6t6XolWpX5AzDY6TGieISDcPLFr6Z2Xf/zLyhh8Nquigudn5c2TKuJunO8aneIYqLC9+Smec3OY5tgFGZuXynG8+hopjFfktmDmxyKOqAP8snSVI3lN3225ZdgUtTtHb+sdlxSd1lkqhZ6Rd8NMZGHzmf8hqMaqpf4PH5cfcnZrz23awSFN2fb1B0Nz8E/KAJcXycTKJ4z6kXs7tZkiRJdWxJlCRJUp32ZhB+LCyxxBI5cODAZochSZLUqbFjx76amY3+4Y0e87FOEgcOHMiYMWOaHYYkSVKnIuLpzrfqPexuliRJUh2TREmSJNUxSZQkSVKdj/WYREmSGuWDDz5g4sSJvP9+3a8SSh3q378/yyyzDP369Wt2KN1ikihJUhsmTpzIggsuyMCBA4mIZoejj4nM5LXXXmPixIkMGjSo2eF0i93NkiS14f3332fxxRc3QVSXRASLL774bNECbZIoSVI7TBA1M2aX48YkUZIkSXUckyhJUgUDj762R8ubcNJ2PVqe1NNsSZQkSQ237bbbMmnSJAAWWGCBLu173HHHceqppzYgqo61xDlhwgQuvvjiWV5/s5kkSpKkhrvuuutYZJFFGl7P1KlTe7xMk0RJktSrTJgw
gVVXXZUDDjiA1Vdfna222or33nuPcePGsf766zNkyBB22mkn3njjDQCGDx/OUUcdxXrrrcfgwYO5/fbb2y172rRpfOtb32KNNdZgyJAh/PKXvwTgxhtvZK211mKNNdZgv/32Y8qUKQAMHDiQY489lg022IBhw4Zx9913s/XWW7PiiisyatQoAG655RY22WQTdtppJ1ZbbTUOPvhgPvzww+n7v/rqq3VxnHLKKay77roMGTKEH/7wh9OXn3jiiay88spsscUWPPLIIx0+T8OHD+fYY49l00035fTTT2fs2LFsuummrLPOOmy99da88MILAJxxxhmsttpqDBkyhN133x2ob6X8zGc+w4QJE2Yo/+ijj+b2229n6NCh/PznP2f8+PGst956DB06lCFDhvDYY491GN/HlUmiJEm92GOPPcYhhxzC+PHjWWSRRbjyyivZe++9+elPf8p9993HGmuswfHHHz99+6lTp3LXXXfxi1/8YoblrZ199tk89dRT3HPPPdx3332MGDGC999/n5EjR3LZZZdx//33M3XqVM4666zp+yy77LLccccdbLzxxowcOZIrrriCO++8kx/84AfTt7nrrrv42c9+xv33388TTzzBVVdd1W4Mo0eP5rHHHuOuu+5i3LhxjB07lttuu42xY8dy6aWXcs8993DVVVfxn//8p9PnadKkSdx6660cdthhfP3rX+eKK65g7Nix7Lfffnz3u98F4KSTTpr+eFsS2ypOOukkNt54Y8aNG8eRRx7JqFGjOPzwwxk3bhxjxoxhmWWWqVzWx4kTVyRJ6sUGDRrE0KFDAVhnnXV44oknmDRpEptuuikA++yzD7vuuuv07Xfeeefp27ZuEat1ww03cPDBB9O3b5EKLLbYYtx7770MGjSIwYMHTy/7zDPP5IgjjgBghx12AGCNNdbg7bffZsEFF2TBBRekf//+08cbrrfeeqywwgoA7LHHHvzjH/9gl112aTOG0aNHM3r0aNZaay0A3n77bR577DEmT57MTjvtxHzzzTdDvR3ZbbfdAHjkkUd44IEH2HLLLYGixXTJJZcEYMiQIYwYMYIdd9yRHXfcsdMy27PBBhtw4oknMnHiRHbeeWdWWmmlmS6rN7MlUZKkXmyeeeaZfrtPnz7Tk7HOtu/Tp0+H4/Mys+56fplZqey55pprhrjmmmuu6XW1LrOjawZmJscccwzjxo1j3LhxPP7443z1q1/tdL+2zD///NPLXH311aeXef/99zN69GgArr32Wg455BDGjh3LOuusw9SpU+nbt+/0LnGg0kWwv/KVr3D11Vcz77zzsvXWW3PTTTd1KdaPC1sSJUmqoLdcsmbhhRdm0UUX5fbbb2fjjTfmd7/73fRWxa7YaqutGDVqFMOHD6dv3768/vrrrLLKKkyYMIHHH3+cT3/60zNV9l133cVTTz3F8ssvz2WXXcaBBx7Y7rZbb7013//+9xkxYgQLLLAAzz33HP369WOTTTZh5MiRHH300UydOpVrrrmGgw46qFL9K6+8Mq+88gp33HEHG2ywAR988AGPPvooq666Ks8++yybbbYZG220ERdffDFvv/02AwcO5C9/+QsAd999N0899VRdmQsuuCCTJ0+efv/JJ59khRVW4LDDDuPJJ5/kvvvu43Of+1yXnqePA5PECnr62liqrreclCWpN7ngggs4+OCDeffdd1lhhRU477zzulzG/vvvz6OPPsqQIUPo168fBxxwAIceeijnnXceu+66K1OnTmXdddfl4IMP7lK5G2ywAUcffTT333//9Eks7dlqq6146KGH2GCDDYDikjO///3vWXvttdltt90YOnQoyy+/PBtvvHHl+ueee26uuOIKDjvsMN58802mTp3KEUccweDBg9lzzz158803yUyOPPJIFllkEb70pS9x4YUXMnToUNZdd93pXe21hgwZQt++fVlzzTUZOXIk77//Pr///e/p168fn/rUp2YYkzk7ic6alnuzYcOG
5ZgxYxpej0li85gkSmqWhx56iFVXXbXZYXys3HLLLZx66qnTW+bmZG0dPxExNjOHNSmkLnNMoiRJkurY3SxJ0mzsb3/7G0cdddQMywYNGsQf//jHHq9r+PDhDB8+vMfLbXHIIYfwz3/+c4Zlhx9+OPvuu2/D6pyTmSRKkjQb23rrrdl6662bHUaPOPPMM5sdwhzF7mZJkiTVMUmUJElSHZNESZIk1XFMoiRJVRy3cA+X92bPltdNP/jBD9hkk03YYostGD58OKeeeirDhlW7WkuzLn1TG+ePf/xjjj322Fla/+zOlkRJksSPfvQjtthii4bX09FPBXbHj3/844aUOyczSZQkqRe78MILGTJkCGuuuSZ77bUXTz/9NJtvvjlDhgxh880355lnngFg5MiR/O///i+bbbYZK6ywArfeeiv77bcfq666KiNHjpxe3gILLMA3v/lN1l57bTbffHNeeeWV6ftfccUVdfWPHj2aDTbYgLXXXptdd92Vt99+G4Drr7+eVVZZhY022oirrrqqw8dw3HHHceCBB7LVVlux995788orr/ClL32Jddddl3XXXXf6ZW1uvfVWhg4dytChQ1lrrbWYPHkyt9xyC9tvv/30sg499FDOP//8Gco/+uijee+99xg6dCgjRozgnXfeYbvttmPNNdfkM5/5DJdddlmXn3eZJEqS1GuNHz+eE088kZtuuol7772X008/nUMPPZS9996b++67jxEjRnDYYYdN3/6NN97gpptu4uc//zlf+MIXOPLIIxk/fjz3338/48aNA+Cdd95h7bXX5u6772bTTTfl+OOPb7f+V199lRNOOIEbbriBu+++m2HDhnHaaafx/vvvc8ABB3DNNddw++238+KLL3b6WMaOHcuf//xnLr74Yg4//HCOPPJI/vOf/3DllVey//77A3Dqqady5plnMm7cOG6//XbmnXfeSs/TSSedxLzzzsu4ceO46KKLuP7661lqqaW49957eeCBB/j85z9fqRzNyCRRkqRe6qabbmKXXXZhiSWWAGCxxRbjjjvu4Ctf+QoAe+21F//4xz+mb/+FL3yBiGCNNdbgk5/8JGussQZzzTUXq6++OhMmTABgrrnmYrfddgNgzz33nGH/1u68804efPBBNtxwQ4YOHcoFF1zA008/zcMPP8ygQYNYaaWViAj23HPPTh/LDjvsMD3pu+GGGzj00EMZOnQoO+ywA2+99RaTJ09mww035Bvf+AZnnHEGkyZNom/fmZs6scYaa3DDDTdw1FFHcfvtt7Pwwj08nnQO4cQVSZJ6qcwkIjrcpnb9PPPMAxSJYMvtlvvtjQXsqPzMZMstt+SSSy6ZYfm4ceM6jau1+eeff/rtDz/8kDvuuKOupfDoo49mu+2247rrrmP99dfnhhtuoG/fvnz44YfTt3n//fc7rWvw4MGMHTuW6667jmOOOYatttqKH/zgB12KV7YkSpLUa22++eZcfvnlvPbaawC8/vrrfPazn+XSSy8F4KKLLmKjjTbqUpkffvjh9LGHF198cYf7r7/++vzzn//k8ccfB+Ddd9/l0UcfZZVVVuGpp57iiSeeAKhLIjuz1VZb8atf/Wr6/Zau8CeeeII11liDo446imHDhvHwww+z/PLL8+CDDzJlyhTefPNNbrzxxjbL7NevHx988AEAzz//PPPNNx977rkn3/rWt7j77ru7FJ8KtiRKklRFEy5Zs/rqq/Pd736XTTfdlD59+rDWWmtxxhlnsN9++3HKKacwYMAAzjvvvC6VOf/88zN+/HjWWWcdFl544Q4ndQwYMIDzzz+fPfbYgylTpgBwwgknMHjwYM4++2y22247llhiCTbaaCMeeOCByjGcccYZHHLIIQwZMoSpU6eyySabMGrUKH7xi19w880306dPH1ZbbTW22WYb5plnHr785S8zZMgQVlppJdZaa602yzzwwAMZMmQIa6+9NnvvvTff/va3mWuuuejXrx9nnXVWl54jFSIz
mx3DTBs2bFiOGTOm4fUMPPrahtehtk04abtmhyBpDvXQQw+x6qqrNjuMHrfAAgtMn6Gsxmnr+ImIsZlZ7eKTvUDDupsj4tyIeDkiHqhZdllEjCv/JkTEuHL5wIh4r2bdqEbFJUmSpM41srv5fOBXwIUtCzJzt5bbEfEzoLbt/onMHNrAeCRJmuM1shXxvPPO4/TTT59h2YYbbsiZZ57ZsDrVOA1LEjPztogY2Na6KKZEfRn4XKPqlyRJs9a+++7Lvvvu2+ww1EOaNXFlY+ClzHysZtmgiLgHeAv4Xmbe3pzQ1Kv09G+lqrpe9ruyUjNUuQSN1NrHeb5HrWZdAmcPoHa+/AvAcpm5FvAN4OKIWKitHSPiwIgYExFjWn5KSJKknta/f39ee+212eYDX7NGZvLaa6/Rv3//ZofSbbO8JTEi+gI7A+u0LMvMKcCU8vbYiHgCGAzUTV3OzLOBs6GY3TwrYpYkzXmWWWYZJk6ciA0S6qr+/fuzzDLLNDuMbmtGd/MWwMOZObFlQUQMAF7PzGkRsQKwEvBkE2KTJAkoLs48aNCgZochNU0jL4FzCXAHsHJETIyIr5ardmfGrmaATYD7IuJe4Arg4Mx8vVGxSZIkqWONnN28RzvLR7ax7ErgykbFIkmSpK7xt5slSZJUxyRRkiRJdUwSJUmSVMckUZIkSXVMEiVJklTHJFGSJEl1TBIlSZJUxyRRkiRJdUwSJUmSVMckUZIkSXVMEiVJklTHJFGSJEl1TBIlSZJUxyRRkiRJdUwSJUmSVMckUZIkSXVMEiVJklTHJFGSJEl1TBIlSZJUxyRRkiRJdUwSJUmSVMckUZIkSXVMEiVJklTHJFGSJEl1TBIlSZJUxyRRkiRJdUwSJUmSVMckUZIkSXVMEiVJklTHJFGSJEl1TBIlSZJUxyRRkiRJdUwSJUmSVMckUZIkSXVMEiVJklTHJFGSJEl1TBIlSZJUxyRRkiRJdUwSJUmSVMckUZIkSXU6TRIj4hMRsVNEHBIR+0XEehFRZb9zI+LliHigZtlxEfFcRIwr/7atWXdMRDweEY9ExNYz/5AkSZLUXX3bWxERmwFHA4sB9wAvA/2BHYEVI+IK4GeZ+VY7RZwP/Aq4sNXyn2fmqa3qWg3YHVgdWAq4ISIGZ+a0rj4gSZIkdV+7SSKwLXBAZj7TekVE9AW2B7YErmxr58y8LSIGVozji8ClmTkFeCoiHgfWA+6ouL8kSZJ6ULvdxpn5bWBiRHy5jXVTM/NPmdlmgtiJQyPivrI7etFy2dLAszXbTCyXSZIkqQk6HFuYmR8CX+/B+s4CVgSGAi8APyuXR1vVt1VARBwYEWMiYswrr7zSg6FJkiSpRZXZzaMj4lsRsWxELNbyNzOVZeZLmTmtTD7PoehShqLlcNmaTZcBnm+njLMzc1hmDhswYMDMhCFJkqROdDQmscV+5f9DapYlsEJXK4uIJTPzhfLuTkDLzOergYsj4jSKiSsrAXd1tXxJkiT1jE6TxMwcNDMFR8QlwHBgiYiYCPwQGB4RQymSzAnAQWUd4yPicuBBYCpwiDObJUmSmqfTJDEi5gO+ASyXmQdGxErAypn5l472y8w92lj82w62PxE4sbN4JEmS1HhVxiSeB/wX+Gx5fyJwQsMikiRJUtNVSRJXzMyTgQ8AMvM92p6NLEmSpNlElSTxvxExL+UlaSJiRWBKQ6OSJElSU1WZ3XwccD2wbERcBGwI7NvIoCRJktRcVWY3j46IscD6FN3Mh2fmqw2PTJIkSU3TaXdzRNyYma9l5rWZ+ZfMfDUibpwVwUmSJKk52m1JjIj+wHwU1zlclI8mqyxEccFrSZIkzaY66m4+CDiCIiEcy0dJ4lvAmY0NS5IkSc3UbpKYmacDp0fEYZl5Ru26iJin4ZFJkiSpaapcAmdkG8vu6OE4JEmS1It0NCbxU8DSwLwRsRYzjkmcbxbEJkmSpCbpaEzi1hStiMsAp9Usfws4toExSZIkqck6
GpN4AXBBRHwpM6+chTFJkiSpyaqMSfxnRPw2Iv4KEBGrRcRXGxyXJEmSmqhKknge8Dc+ujbioxSXxpEkSdJsqkqSuERmXg58CJCZU4FpDY1KkiRJTVUlSXwnIhYHEiAi1gfebGhUkiRJaqqOZje3+AZwNbBiRPwTGADs0tCoJEmS1FSdJomZeXdEbAqsTHGtxEcy84OGRyZJkqSm6TRJjIj+wNeAjSi6nG+PiFGZ+X6jg5MkSVJzVOluvhCYDPyyvL8H8Dtg10YFJUmSpOaqkiSunJlr1ty/OSLubVRAkiRJar4qs5vvKWc0AxAR/wP8s3EhSZIkqdnabUmMiPspxiD2A/aOiGfK+8sDD86a8CRJktQMHXU3bz/LopAkSVKv0m6SmJlPz8pAJEmS1HtUGZMoSZKkOYxJoiRJkup0miRGxPwRMVd5e3BE7BAR/RofmiRJkpqlSkvibUD/iFgauBHYFzi/kUFJkiSpuaokiZGZ7wI7A7/MzJ2A1RobliRJkpqpUpIYERsAI4Bry2VVfqlFkiRJH1NVksQjgGOAP2bm+IhYAbi5oVFJkiSpqTptEczMW4Fba+4/CRzWyKAkSZLUXB39LN8vMvOIiLiG4uf4ZpCZOzQ0MkmSJDVNRy2Jvyv/nzorApEkSVLv0dHP8o0t/9/a3jaSJEmaPfmLK5IkSapjkihJkqQ6HSaJEdEnIk6ZVcFIkiSpd+gwSczMacA6ERFdLTgizo2IlyPigZplp0TEwxFxX0T8MSIWKZcPjIj3ImJc+Teqq/VJkiSp51Tpbr4H+HNE7BURO7f8VdjvfODzrZb9HfhMZg4BHqW4SHeLJzJzaPl3cJXgJUmS1BhVfl5vMeA14HM1yxK4qqOdMvO2iBjYatnomrt3ArtUC1OSJEmzUpVfXNm3QXXvB1xWc39QRNwDvAV8LzNvb2uniDgQOBBgueWWa1BokiRJc7ZOu5sjYnBE3NgytjAihkTE97pTaUR8F5gKXFQuegFYLjPXAr4BXBwRC7W1b2aenZnDMnPYgAEDuhOGJEmS2lFlTOI5FGMHPwDIzPuA3We2wojYB9geGJGZWZY5JTNfK2+PBZ4ABs9sHZIkSeqeKknifJl5V6tlU2emsoj4PHAUsENmvluzfEBE9ClvrwCsBDw5M3VIkiSp+6pMXHk1IlakmKxCROxC0T3coYi4BBgOLBERE4EfUrRIzgP8vbyqzp3lTOZNgB9FxFRgGnBwZr7e9YcjSZKknlAlSTwEOBtYJSKeA54CRnS2U2bu0cbi37az7ZXAlRVikSRJ0ixQZXbzk8AWETE/MFdmTm58WJIkSWqmKrObn4iIi4C9gGUbH5IkSZKarcrEldWA3wCLA6dGxJMR8cfGhiVJkqRmqpIkTqO4/M004EPgJeDlRgYlSZKk5qoyceUt4H7gNOCclusZSpIkafZVpSVxD+A24GvApRFxfERs3tiwJEmS1ExVZjf/GfhzRKwCbAMcAXwHmLexoUmSJKlZqsxuvjIingBOBxYA9gYWbXRgkiRJap4qYxJPAu7OzGmNDkaSJEm9Q5UkcRxwSERsUt6/FRiVmR80LCpJkiQ1VZUk8SygH/Dr8v5e5bL9GxWUJEmSmqtKkrhuZq5Zc/+miLi3UQFJkiSp+SpdTDsiVmy5ExErUFxYW5IkSbOpKi2J3wZujogngQCWB/ZtaFSSJElqqirXSbwxIlYCVqZIEh/OzCkNj0ySJElN026SGBE7t7NqxYggM69qUEySJElqso5aEr/QwboETBIlSZJmU+0miZnpuENJkqQ5VJXZzZIkSZrDmCRKkiSpjkmiJEmS6nQ5SYyIYRGxdCOCkSRJUu8wMy2JXwf+EhGX9XQwkiRJ6h2q/OLKDDJzH4CIWLDnw5EkSVJv0GlLYkRsGBHzl7f3jIjTImL5zJzc+PAkSZLUDFW6m88C3o2INYHvAE8DFzY0KkmSJDVVlSRxamYm8EXg9Mw8HbCrWZIkaTZWZUzi5Ig4
BtgT2CQi+gD9GhuWJEmSmqlKS+JuwBTgq5n5IrA0cEpDo5IkSVJTddqSWCaGp9XcfwbHJEqSJM3W2k0SI2IykO2tz8yFGhKRJEmSmq7dJDEzFwSIiB8BLwK/AwIYgRNXJEmSZmtVxiRunZm/zszJmflWZp4FfKnRgUmSJKl5qiSJ0yJiRET0iYi5ImIEMK3RgUmSJKl5qiSJXwG+DLxU/u1aLpMkSdJsqsrs5gkUF9KWJEnSHKLTJDEiBgAHAANrt8/M/RoXliRJkpqpyi+u/Bm4HbgBxyJKkiTNEaokifNl5lENj0SSJEm9RpWJK3+JiG0bHokkSZJ6jSpJ4uEUieL7ETG5/Hurs50i4tyIeDkiHqhZtlhE/D0iHiv/L1qz7piIeDwiHomIrWfu4UiSJKkndJokZuaCmTlXZvYvby9Y8Sf5zgc+32rZ0cCNmbkScGN5n4hYDdgdWL3c59cR0acLj0OSJEk9qEpLIhGxQ0ScWv5tX2WfzLwNeL3V4i8CF5S3LwB2rFl+aWZOycyngMeB9arUI0mSpJ7XaZIYESdRdDk/WP4dXi6bGZ/MzBcAyv+fKJcvDTxbs93Ecllb8RwYEWMiYswrr7wyk2FIkiSpI1VmN28LDM3MDwEi4gLgHsqu4h4SbSzLtjbMzLOBswGGDRvW5jaSJEnqnkrdzcAiNbcX7kZ9L0XEkgDl/5fL5ROBZWu2WwZ4vhv1SJIkqRuqJIk/Ae6JiPPLVsSxwI9nsr6rgX3K2/tQXKi7ZfnuETFPRAwCVgLumsk6JEmS1E1Vfrv5koi4BViXolv4qMx8sbP9IuISYDiwRERMBH4InARcHhFfBZ4Bdi3rGB8Rl1OMeZwKHJKZ/rqLJElSk1T57eadgJsy8+ry/iIRsWNm/qmj/TJzj3ZWbd7O9icCJ3YWjyRJkhqvSnfzDzPzzZY7mTmJolVQkiRJs6kqSWJb21SZFS1JkqSPqSpJ4piIOC0iVoyIFSLi5xSTVyRJkjSbqpIkfh34L3AZcDnwHnBII4OSJElSc1WZ3fwOcHRELJCZb8+CmCRJktRkVX6W77MR0fKTfETEmhHx64ZHJkmSpKap0t38c2Br4DWAzLwX2KSRQUmSJKm5Kv0sX2Y+22qRF7qWJEmajVW5lM2zEfFZICNibuAw4KHGhiVJkqRmqtKSeDDFbOalgYnAUJzdLEmSNFurMrv5VWDELIhFkiRJvUSV2c0nR8RCEdEvIm6MiFcjYs9ZEZwkSZKao0p381aZ+RawPUV382Dg2w2NSpIkSU1VJUnsV/7fFrgkM19vYDySJEnqBarMbr4mIh6m+Dm+r0XEAOD9xoYlSZKkZuq0JTEzjwY2AIZl5gfAu8AXGx2YJEmSmqfdJDEiNmq5nZlvZOa08vY7mfliOZnlM7MiSEmSJM1aHXU3fykiTgauB8YCrwD9gU8DmwHLA99seISSJEma5dpNEjPzyIhYFNgF2BVYkmJc4kPAbzLzH7MmREmSJM1qHU5cycw3gHPKP0mSJM0hqlwCR5IkSXMYk0RJkiTVMUmUJElSnSq/3TxfRHw/Is4p768UEds3PjRJkiQ1S5WWxPOAKRQX1Ibi95tPaFhEkiRJaroqSeKKmXky8AFAZr4HREOjkiRJUlNVSRL/GxHzAgkQEStStCxKkiRpNtXhdRJLP6T41ZVlI+IiYENgZCODkiRJUnN1miRm5t8j4m5gfYpu5sMz89WGRyZJkqSmqXoJnKWBPsDcwCYRsXPjQpIkSVKzddqSGBHnAkOA8cCH5eIErmpgXJIkSWqiKmMS18/M1RoeiSRJknqNKt3Nd0SESaIkSdIcpEpL4gUUieKLFJe+CSAzc0hDI5MkSVLTVEkSzwX2Au7nozGJkiRJmo1VSRKfycyrGx6JJEmSeo0qSeLDEXExcA01v7SSmc5uliRJmk1VSRLnpUgOt6pZ5iVwJEmSZmNVfnFl31kRiCRJknqPdpPEiPhOZp4cEb+kaDmcQWYeNjMV
RsTKwGU1i1YAfgAsAhwAvFIuPzYzr5uZOiRJktQ9HbUkPlT+H9OTFWbmI8BQgIjoAzwH/BHYF/h5Zp7ak/VJkiSp69pNEjPzmvLmu5n5h9p1EbFrD9W/OfBEZj4dET1UpCRJkrqryi+uHFNx2czYHbik5v6hEXFfRJwbEYv2UB2SJEnqoo7GJG4DbAssHRFn1KxaCJja3YojYm5gBz5KOM8C/o9i/OP/AT8D9mtjvwOBAwGWW2657oYhSZKkNnTUkvg8xXjE94GxNX9XA1v3QN3bAHdn5ksAmflSZk7LzA+Bc4D12topM8/OzGGZOWzAgAE9EIYkSZJa62hM4r3AvRFxcWZ+0IC696CmqzkilszMF8q7OwEPNKBOSZIkVVDlOok9niBGxHzAlsBBNYtPjoihFN3NE1qtkyRJ0ixU5RdXelxmvgss3mrZXs2IRZIkSfWqzG6WJEnSHKbTlsSIGAx8G1i+dvvM/FwD45IkSVITVelu/gMwimLG8bTGhiNJkqTeoEqSODUzz2p4JJIkSeo1OrqY9mLlzWsi4msUv688pWV9Zr7e4NgkSZLUJB21JI6luBxNy48qf7tmXQIrNCooSZIkNVdHF9MeNCsDkSRJUu/R6SVwIuKQiFik5v6iZfezJEmSZlNVrpN4QGZOarmTmW8ABzQsIkmSJDVdlSRxrohoGZdIRPQB5m5cSJIkSWq2KpfA+RtweUSMopiwcjBwfUOjkiRJUlNVSRKPAg4C/pdipvNo4P81MihJkiQ1V6dJYmZ+GBG/Bf5B0ZL4SGb6yyuSJEmzsSq/3TwcuACYQNGSuGxE7JOZtzU0MkmSJDVNle7mnwFbZeYjABExGLgEWKeRgUmSJKl5qsxu7teSIAJk5qNAv8aFJEmSpGar0pI4phyT+Lvy/giKn+yTJEnSbKpKkvi/wCHAYRRjEm8Dft3IoCRJktRcVWY3T4mIXwE3Ah9SzG7+b8MjkyRJUtNUmd28HTAKeIKiJXFQRByUmX9tdHCSJElqjqqzmzfLzMcBImJF4FrAJFGSJGk2VWV288stCWLpSeDlBsUjSZKkXqBKS+L4iLgOuJziF1d2Bf4TETsDZOZVDYxPkiRJTVAlSewPvARsWt5/BVgM+AJF0miSKEmSNJupMrt531kRiCRJknqPTsckRsTgiLgxIh4o7w+JiO81PjRJkiQ1S5WJK+cAxwAfAGTmfcDujQxKkiRJzVUlSZwvM+9qtWxqI4KRJElS71AlSXy1vDZiAkTELsALDY1KkiRJTVVldvMhwNnAKhHxHPAUMKKhUUmSJKmpqsxufhLYIiLmB+bKzMmND0uSJEnNVKUlEYDMfKeRgUiSJKn3qDImUZIkSXMYk0RJkiTVqdTdHBGfBQbWbp+ZFzYoJkmSJDVZp0liRPwOWBEYB0wrFydgkihJkjSbqtKSOAxYLTOz0cFIkiSpd6gyJvEB4FONDkSSJEm9R5WWxCWAByPiLmBKy8LM3KFhUUmSJKmpqiSJxzU6CEmSJPUuVX5x5daerjQiJgCTKSbCTM3MYRGxGHAZxSzqCcCXM/ONnq5bkiRJnet0TGJErB8R/4mItyPivxExLSLe6oG6N8vMoZk5rLx/NHBjZq4E3FjelyRJUhNUmbjyK2AP4DFgXmD/cllP+yJwQXn7AmDHBtQhSZKkCir94kpmPg70ycxpmXkeMLyb9SYwOiLGRsSB5bJPZuYLZX0vAJ/oZh2SJEmaSVUmrrwbEXMD4yLiZOAFYP5u1rthZj4fEZ8A/h4RD1fdsUwqDwRYbrnluhmGJEmS2lKlJXGvcrtDgXeAZYEvdafSzHy+/P8y8EdgPeCliFgSoPz/cjv7np2ZwzJz2IABA7oThiRJktrRaZKYmU8DASyZmcdn5jfK7ueZEhHzR8SCLbeBrSgu2H01sE+52T7An2e2DkmSJHVPldnNX6D43ebry/tDI+LqbtT5SeAfEXEvcBdwbWZeD5wEbBkRjwFblvclSZLUBFUv
pr0ecAtAZo6LiIEzW2FmPgms2cby14DNZ7ZcSZIk9ZwqYxKnZuabDY9EkiRJvUaVlsQHIuIrQJ+IWAk4DPhXY8OSJElSM1VpSfw6sDowBbgEeAs4ooExSZIkqcmq/Hbzu8B3yz9JkiTNAdpNEjubwZyZO/R8OJIkSeoNOmpJ3AB4lqKL+d8U10qUJEnSHKCjJPFTFNcr3AP4CnAtcElmjp8VgUmSJKl52p24kpnTMvP6zNwHWB94HLglIr4+y6KTJElSU3Q4cSUi5gG2o2hNHAicAVzV+LAkSZLUTB1NXLkA+AzwV+D4zHxglkUlSZKkpuqoJXEv4B1gMHBYxPR5KwFkZi7U4NgkSZLUJO0miZlZ5ULbkiRJmg2ZCEqSJKmOSaIkSZLqmCRKkiSpjkmiJEmS6pgkSpIkqY5JoiRJkuqYJEqSJKmOSaIkSZLqmCRKkiSpjkmiJEmS6pgkSpIkqY5JoiRJkuqYJEqSJKmOSaIkSZLqmCRKkiSpjkmiJEmS6pgkSpIkqY5JoiRJkuqYJEqSJKmOSaIkSZLqmCRKkiSpjkmiJEmS6pgkSpIkqY5JoiRJkuqYJEqSJKmOSaIkSZLqmCRKkiSpjkmiJEmS6szyJDEilo2ImyPioYgYHxGHl8uPi4jnImJc+bftrI5NkiRJhb5NqHMq8M3MvDsiFgTGRsTfy3U/z8xTmxCTJEmSaszyJDEzXwBeKG9PjoiHgKVndRySJElqX1PHJEbEQGAt4N/lokMj4r6IODciFm1eZJIkSXO2piWJEbEAcCVwRGa+BZwFrAgMpWhp/Fk7+x0YEWMiYswrr7wyq8KVJEmaozQlSYyIfhQJ4kWZeRVAZr6UmdMy80PgHGC9tvbNzLMzc1hmDhswYMCsC1qSJGkO0ozZzQH8FngoM0+rWb5kzWY7AQ/M6tgkSZJUaMbs5g2BvYD7I2JcuexYYI+IGAokMAE4qAmxSZIkiebMbv4HEG2sum5WxyJJkqS2+YsrkiRJqmOSKEmSpDomiZIkSapjkihJkqQ6JomSJEmqY5IoSZKkOiaJkiRJqmOSKEmSpDomiZIkSapjkihJkqQ6JomSJEmqY5IoSZKkOiaJkiRJqmOSKEmSpDomiZIkSapjkihJkqQ6JomSJEmqY5IoSZKkOn2bHYAkqXcZePS1zQ5hjjWh/1eaHcKc67g3mx1Br2NLoiRJkuqYJEqSJKmOSaIkSZLqmCRKkiSpjkmiJEmS6pgkSpIkqY5JoiRJkuqYJEqSJKmOSaIkSZLqmCRKkiSpjkmiJEmS6pgkSpIkqY5JoiRJkuqYJEqSJKmOSaIkSZLqmCRKkiSpjkmiJEmS6pgkSpIkqY5JoiRJkuqYJEqSJKmOSaIkSZLq9LokMSI+HxGPRMTjEXF0s+ORJEmaE/WqJDEi+gBnAtsAqwF7RMRqzY1KkiRpztOrkkRgPeDxzHwyM/8LXAp8sckxSZIkzXF6W5K4NPBszf2J5TJJkiTNQn2bHUAr0caynGGDiAOBA8u7b0fEIw2PSk0TsATwarPjmCMd39bbUVIjec5rollzzlt+VlTSU3pbkjgRWLbm/jLA87UbZObZwNmzMig1T0SMycxhzY5DkmYFz3nqTXpbd/N/gJUiYlBEzA3sDlzd5JgkSZLmOL2qJTEzp0bEocDfgD7AuZk5vslhSZIkzXF6VZIIkJnXAdc1Ow71Gg4tkDQn8ZynXiMys/OtJEmSNEfpbWMSJUmS1Av0uu5m9V4RsThwY3n3U8A04JXy/nrlBdDb23cYsHdmHtaF+iYAk8t6AG7ryv4Vyn87MxfoqfIkzZ66c+4r9x8O/Dcz/9XGupHAKcBzNYu/kpkPdi/q6eUfB7ydmaf2RHmas5gkqrLMfA0YCm2feCKib2ZObWffMcCYmah2s8z0mmGSmqazc18Fw4G3gboksXRZZh7ajRClhrC7Wd0SEedHxGkRcTPw04hYLyL+FRH3lP9XLrcbHhF/KW8fFxHnRsQtEfFk
RHSpdbDc7xdl+Q9ExHrl8sUi4k8RcV9E3BkRQ8rlC0TEeRFxf7nuSzVlnRgR95bbf7LHnhhJs7WIWCcibo2IsRHxt4hYslx+WEQ8WJ5rLo2IgcDBwJERMS4iNq5Y/vCIuC0i/liWNyoi5irX7VGezx6IiJ/W7PP5iLi7PKfdWFPcajN7vtWczZZE9YTBwBaZOS0iFgI2KS9ntAXwY+BLbeyzCrAZsCDwSESclZkftLHdzRHR0t18QWb+vLw9f2Z+NiI2Ac4FPgMcD9yTmTtGxOeACym+/X8feDMz1wCIiEVbygDuzMzvRsTJwAHACd15IiTNEQL4JfDFzHwlInYDTgT2A44GBmXmlIhYJDMnRcQoOm593C0iNqq5v0H5fz1gNeBp4Hpg54j4F/BTYB3gDWB0ROwI/BM4h+L8+1RELFZTXtXzrTQDk0T1hD9kZksitzBwQUSsRPGTiv3a2efazJwCTImIl4FPUvziTmvtdTdfApCZt0XEQhGxCLARZUKamTdFxOIRsTCwBcWF2SnXvVHe/C/wl/L2WGDLSo9W0pxuHoovpn+PCCiu6/tCue4+4KKI+BPwp4rl1XU3l+XelZlPlvcvoTjHfQDckpmvlMsvAjahGCd5W2Y+BZCZr9cUV/V8K83AJFE94Z2a2/8H3JyZO5XdLLe0s8+UmtvT6Pqx2PraTUn7v/0dbWwP8EF+dA2omYlB0pwpgPGZuUEb67ajSNp2AL4fEat3o56q57mWmNq7pl13z7eaQzkmUT1tYT6apTeygfXsBlB20byZmW8CtwEjyuXDgVcz8y1gNDD9W3pNd7MkzYwpwICI2AAgIvpFxOrlmMFlM/Nm4DvAIsACFFdpWHAm6lmv/JnauSjOef8A/g1sGhFLREQfYA/gVuCOcvmgMqbF2itUqsokUT3tZOAnEfFPii6Y7rq5HOw9LiIurFn+Rjk2ZxTw1XLZccCwiLgPOAnYp1x+ArBoOcj7XoqxOZI0sz4EdqGYrHcvMA74LMU57/cRcT9wD/DzzJwEXAPs1MHEld1qznPjIuKz5fI7KM5lDwBPAX/MzBeAY4CbgXuBuzPzz2X384HAVWVMlzXkkWuO4i+u6GMnIm4BvlVeVkeSZjtlb8i3MnP7JoeiOZgtiZIkSapjS6IkSZLq2JIoSZKkOiaJkiRJqmOSKEmSpDomiZIkSapjkihJkqQ6JomSJEmq8/8BJrmdApZexgMAAAAASUVORK5CYII=",
      "text/plain": [
       "<Figure size 720x504 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Create directory for saving figures\n",
    "import os\n",
    "dir_to_save_figures_in = \"pytorch_2_results/figures/\" \n",
    "os.makedirs(dir_to_save_figures_in, exist_ok=True)\n",
    "\n",
    "# Create a save path for the single run results\n",
    "save_path_multi_run = f\"{dir_to_save_figures_in}single_run_{GPU_NAME}_{MODEL_NAME}_{DATASET_NAME}_{IMAGE_SIZE}_train_epoch_time.png\"\n",
    "print(f\"[INFO] Save path for single run results: {save_path_multi_run}\")\n",
    "\n",
    "# Plot the results and save the figures\n",
    "plot_mean_epoch_times(non_compiled_results=single_run_no_compile_results_df, \n",
    "                      compiled_results=single_run_compile_results_df, \n",
    "                      multi_runs=False, \n",
    "                      save_path=save_path_multi_run, \n",
    "                      save=True)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Hmm... what's happening here?\n",
    "\n",
    "It looks like the model with `torch.compile()` took *longer* than the model without it (on an A100, this is the case, but on my local NVIDIA TITAN RTX, the compiled model is slightly faster).\n",
    "\n",
    "Why might this be the case?\n",
    "\n",
    "Well on a per epoch time we can see that although experiment 2 (with `torch.compile()`) was far slower for the first epoch, it started being faster than experiment 1 (without `torch.compile()`) for subsequent epochs.\n",
    "\n",
    "This is because behind the scenes `torch.compile()` spends the first steps of a training run \"warming up\" the model and performing optimization steps behind the scenes.\n",
    "\n",
    "These **optimization steps take time up front** but mean **subsequent steps should be faster**.\n",
    "\n",
    "To test if this is true, you could try training the model above for longer (say 50 epochs rather than 5) and see what the average training times come out to be."
   ]
  },
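  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One way to reason about this trade-off is a rough break-even estimate: how many epochs does a run need before the up-front cost is paid back?\n",
    "\n",
    "The sketch below is only an illustration. `epochs_to_break_even()` is a hypothetical helper (it's not part of PyTorch) and the timings passed to it are made-up placeholders rather than measurements from this notebook.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "def epochs_to_break_even(first_compiled_epoch_s, steady_compiled_epoch_s, eager_epoch_s):\n",
    "    \"\"\"\n",
    "    Hypothetical helper: smallest number of epochs at which a compiled run is faster overall.\n",
    "\n",
    "    Assumes each non-compiled epoch takes eager_epoch_s seconds and each compiled epoch\n",
    "    after the first takes steady_compiled_epoch_s seconds.\n",
    "    \"\"\"\n",
    "    saving_per_epoch = eager_epoch_s - steady_compiled_epoch_s\n",
    "    if saving_per_epoch <= 0:\n",
    "        return None  # compiled epochs are no faster, so the overhead is never recovered\n",
    "\n",
    "    # Compiled total for n epochs: first + (n - 1) * steady\n",
    "    # Non-compiled total for n epochs: n * eager\n",
    "    # Compiled wins once n > (first - steady) / (eager - steady)\n",
    "    warmup_overhead = first_compiled_epoch_s - steady_compiled_epoch_s\n",
    "    return max(1, math.floor(warmup_overhead / saving_per_epoch) + 1)\n",
    "\n",
    "# Placeholder timings in seconds (not measured values)\n",
    "n_epochs = epochs_to_break_even(first_compiled_epoch_s=100.0,\n",
    "                                steady_compiled_epoch_s=40.0,\n",
    "                                eager_epoch_s=50.0)\n",
    "print(f\"[INFO] Compiled run becomes faster overall from epoch: {n_epochs}\")\n",
    "```\n",
    "\n",
    "If the break-even point is larger than the number of epochs you plan to train for, compiling may not pay off for that run."
   ]
  },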
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.4 Save single run results to file with GPU details\n",
    "\n",
    "We can save the raw data of our results to file too by exporting the dataframes as CSVs.\n",
    "\n",
    "We'll first create a directory for storing results.\n",
    "\n",
    "Then we'll create filepaths to save each of the target dataframes to before exporting them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[INFO] Saving non-compiled experiment 1 results to: pytorch_2_results/single_run_results/single_run_non_compiled_results_CIFAR10_ResNet50_NVIDIA_TITAN_RTX.csv\n",
      "[INFO] Saving compiled experiment 2 results to: pytorch_2_results/single_run_results/single_run_compiled_results_CIFAR10_ResNet50_NVIDIA_TITAN_RTX.csv\n"
     ]
    }
   ],
   "source": [
    "# Make a directory for single_run results\n",
    "import os\n",
    "pytorch_2_results_dir = \"pytorch_2_results\"\n",
    "pytorch_2_single_run_results_dir = f\"{pytorch_2_results_dir}/single_run_results\"\n",
    "os.makedirs(pytorch_2_single_run_results_dir, exist_ok=True)\n",
    "\n",
    "# Create filenames for each of the dataframes\n",
    "save_name_for_non_compiled_results = f\"single_run_non_compiled_results_{DATASET_NAME}_{MODEL_NAME}_{GPU_NAME}.csv\"\n",
    "save_name_for_compiled_results = f\"single_run_compiled_results_{DATASET_NAME}_{MODEL_NAME}_{GPU_NAME}.csv\"\n",
    "\n",
    "# Create filepaths to save the results to\n",
    "single_run_no_compile_save_path = f\"{pytorch_2_single_run_results_dir}/{save_name_for_non_compiled_results}\"\n",
    "single_run_compile_save_path = f\"{pytorch_2_single_run_results_dir}/{save_name_for_compiled_results}\"\n",
    "print(f\"[INFO] Saving non-compiled experiment 1 results to: {single_run_no_compile_save_path}\")\n",
    "print(f\"[INFO] Saving compiled experiment 2 results to: {single_run_compile_save_path}\")\n",
    "\n",
    "# Save the results\n",
    "single_run_no_compile_results_df.to_csv(single_run_no_compile_save_path)\n",
    "single_run_compile_results_df.to_csv(single_run_compile_save_path)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Time models across multiple runs\n",
    "\n",
    "Now we've tested our model with a single run with `torch.compile()` on and off, let's do the same for multiple runs.\n",
    "\n",
    "We're going to start by creating three functions for experiments 3 and 4.\n",
    "\n",
    "1. **Experiment 3:** `create_and_train_non_compiled_model()` - this function will be similar to the workflow we've used for the single runs. We'll put the model creation (via `create_model()`) and training in a single function so we can call it multiple times (for multiple runs) and measure the time of each run.\n",
    "2. **Experiment 4:** `create_compiled_model()` - this function will be similar to the `create_model()` function above, however, it will create a normal PyTorch model and then call `torch.compile()` on it and return it.\n",
    "3. **Experiment 4:** `train_compiled_model()` - this function will take in a compiled model and train it in the same way we've been training our models for single runs.\n",
    "\n",
    "Why separate functions 2 and 3 (`create_compiled_model()` and `train_compiled_model()`) for experiment 4?\n",
    "\n",
    "Because calling `torch.compile()` on model means that for the first few runs, the model will be \"warming up\" as PyTorch calculates a bunch of optimization steps behind the scenes.\n",
    "\n",
    "So in practice, you'll generally want to compile up front *once* and then train/perform inference with an already compiled model."
   ]
  },
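  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since the warm-up cost shows up in the *first calls* to a compiled model, it can help to time the first call separately from the rest.\n",
    "\n",
    "Here's a minimal sketch of that idea using only the Python standard library. `time_first_and_steady()` and `SlowFirstCall` are hypothetical names made up for this example, and `SlowFirstCall` is just a stand-in that sleeps on its first call (it's not a real compiled model).\n",
    "\n",
    "```python\n",
    "import time\n",
    "\n",
    "def time_first_and_steady(fn, n_calls=5):\n",
    "    \"\"\"\n",
    "    Hypothetical helper: time the first call to fn() separately from the later calls.\n",
    "    \"\"\"\n",
    "    if n_calls < 2:\n",
    "        raise ValueError(\"n_calls must be at least 2\")\n",
    "\n",
    "    durations = []\n",
    "    for _ in range(n_calls):\n",
    "        start = time.perf_counter()\n",
    "        fn()\n",
    "        durations.append(time.perf_counter() - start)\n",
    "\n",
    "    first_call_s = durations[0]\n",
    "    steady_mean_s = sum(durations[1:]) / (n_calls - 1)\n",
    "    return first_call_s, steady_mean_s\n",
    "\n",
    "class SlowFirstCall:\n",
    "    \"\"\"Stand-in for a compiled model: slow on the first call, fast afterwards.\"\"\"\n",
    "    def __init__(self):\n",
    "        self.warmed_up = False\n",
    "\n",
    "    def __call__(self):\n",
    "        if not self.warmed_up:\n",
    "            time.sleep(0.05)  # pretend one-off optimization work\n",
    "            self.warmed_up = True\n",
    "\n",
    "first_s, steady_s = time_first_and_steady(SlowFirstCall())\n",
    "print(f\"First call: {first_s:.4f}s | Mean of later calls: {steady_s:.6f}s\")\n",
    "```\n",
    "\n",
    "With a real model, `fn` could be something like `lambda: compiled_model(batch)` (where `batch` is an example input you'd have to supply). On a GPU you'd also want to call `torch.cuda.synchronize()` inside `fn` so asynchronous work is included in the measurement."
   ]
  },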
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_and_train_non_compiled_model(epochs=NUM_EPOCHS, \n",
    "                                        learning_rate=LEARNING_RATE, \n",
    "                                        disable_progress_bar=False):\n",
    "    \"\"\"\n",
    "    Create and train a non-compiled PyTorch model.\n",
    "    \"\"\"\n",
    "    model, _ = create_model()\n",
    "    model.to(device)\n",
    "\n",
    "    loss_fn = torch.nn.CrossEntropyLoss()\n",
    "    optimizer = torch.optim.Adam(model.parameters(),\n",
    "                                 lr=learning_rate)\n",
    "\n",
    "    results = train(model=model,\n",
    "                    train_dataloader=train_dataloader,\n",
    "                    test_dataloader=test_dataloader,\n",
    "                    loss_fn=loss_fn,\n",
    "                    optimizer=optimizer,\n",
    "                    epochs=epochs,\n",
    "                    device=device,\n",
    "                    disable_progress_bar=disable_progress_bar)\n",
    "    return results\n",
    "\n",
    "def create_compiled_model():\n",
    "    \"\"\"\n",
    "    Create a compiled PyTorch model and return it.\n",
    "    \"\"\"\n",
    "    model, _ = create_model()\n",
    "    model.to(device)\n",
    "    \n",
    "    compile_start_time = time.time()\n",
    "    ### New in PyTorch 2.x ###\n",
    "    compiled_model = torch.compile(model)\n",
    "    ##########################\n",
    "    compile_end_time = time.time()\n",
    "\n",
    "    compile_time = compile_end_time - compile_start_time\n",
    "\n",
    "    print(f\"Time to compile: {compile_time} | Note: The first time you compile your model, the first few epochs will be slower than subsequent runs.\")\n",
    "    return compiled_model\n",
    "\n",
    "def train_compiled_model(model=compiled_model, \n",
    "                         epochs=NUM_EPOCHS, \n",
    "                         learning_rate=LEARNING_RATE,\n",
    "                         disable_progress_bar=False):\n",
    "    \"\"\"\n",
    "    Train a compiled model and return the results.\n",
    "    \"\"\"\n",
    "    loss_fn = torch.nn.CrossEntropyLoss()\n",
    "    optimizer = torch.optim.Adam(compiled_model.parameters(),\n",
    "                                 lr=learning_rate)\n",
    "    \n",
    "    compile_results = train(model=model,\n",
    "                            train_dataloader=train_dataloader,\n",
    "                            test_dataloader=test_dataloader,\n",
    "                            loss_fn=loss_fn,\n",
    "                            optimizer=optimizer,\n",
    "                            epochs=epochs,\n",
    "                            device=device,\n",
    "                            disable_progress_bar=disable_progress_bar)\n",
    "    \n",
    "    return compile_results"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4.1 Experiment 3 - Multiple runs, no compile\n",
    "\n",
    "Functions ready for experiment 3 and 4! \n",
    "\n",
    "Let's start with experiment 3.\n",
    "\n",
    "| **Experiment** | **Model** | **Data** | **Epochs** | **Batch size** | **Image size** | **`torch.compile()`** |  \n",
    "|----- |-----| -----| -----| -----| -----| -----|\n",
    "| 3 (multi-run) | ResNet50 | CIFAR10 | 3x5 | 128 | 224 | No |\n",
    "\n",
    "We'll set the number of runs to 3 and the number of epochs to 5.\n",
    "\n",
    "We'll create an empty list to store the results and append the results of each run to it after each run.\n",
    "\n",
    "> **Note:** Running the following code can take quite a while depending on the speed of your GPU, for me, it took 20 minutes on a NVIDIA A100 on Google Colab Pro and around 49 minutes on a NVIDIA TITAN RTX."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "4108ca563be14d4a8ed57c3de1697ebd",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/3 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[INFO] Run 1 of 3 for non-compiled model\n",
      "Epoch: 1 | train_loss: 0.8242 | train_acc: 0.7136 | test_loss: 0.5486 | test_acc: 0.8124 | train_epoch_time: 185.1112 | test_epoch_time: 12.9925\n",
      "Epoch: 2 | train_loss: 0.4415 | train_acc: 0.8479 | test_loss: 0.6415 | test_acc: 0.7829 | train_epoch_time: 185.0138 | test_epoch_time: 12.9690\n",
      "Epoch: 3 | train_loss: 0.3229 | train_acc: 0.8882 | test_loss: 0.4486 | test_acc: 0.8488 | train_epoch_time: 185.0366 | test_epoch_time: 12.9433\n",
      "Epoch: 4 | train_loss: 0.2433 | train_acc: 0.9151 | test_loss: 0.4376 | test_acc: 0.8596 | train_epoch_time: 185.0900 | test_epoch_time: 12.9465\n",
      "Epoch: 5 | train_loss: 0.1785 | train_acc: 0.9379 | test_loss: 0.4305 | test_acc: 0.8641 | train_epoch_time: 185.0405 | test_epoch_time: 13.0102\n",
      "[INFO] Run 2 of 3 for non-compiled model\n",
      "Epoch: 1 | train_loss: 0.8304 | train_acc: 0.7101 | test_loss: 0.6132 | test_acc: 0.7884 | train_epoch_time: 185.0911 | test_epoch_time: 13.0429\n",
      "Epoch: 2 | train_loss: 0.4602 | train_acc: 0.8411 | test_loss: 0.6183 | test_acc: 0.7907 | train_epoch_time: 185.0738 | test_epoch_time: 12.9596\n",
      "Epoch: 3 | train_loss: 0.3283 | train_acc: 0.8869 | test_loss: 0.4309 | test_acc: 0.8534 | train_epoch_time: 185.0462 | test_epoch_time: 12.9877\n",
      "Epoch: 4 | train_loss: 0.2474 | train_acc: 0.9140 | test_loss: 0.4525 | test_acc: 0.8565 | train_epoch_time: 184.9521 | test_epoch_time: 12.9942\n",
      "Epoch: 5 | train_loss: 0.1860 | train_acc: 0.9360 | test_loss: 0.6284 | test_acc: 0.8195 | train_epoch_time: 184.9911 | test_epoch_time: 12.9369\n",
      "[INFO] Run 3 of 3 for non-compiled model\n",
      "Epoch: 1 | train_loss: 0.7915 | train_acc: 0.7246 | test_loss: 0.6102 | test_acc: 0.7894 | train_epoch_time: 184.9795 | test_epoch_time: 13.0175\n",
      "Epoch: 2 | train_loss: 0.4394 | train_acc: 0.8477 | test_loss: 0.5958 | test_acc: 0.7968 | train_epoch_time: 184.9266 | test_epoch_time: 12.9909\n",
      "Epoch: 3 | train_loss: 0.3156 | train_acc: 0.8893 | test_loss: 0.4299 | test_acc: 0.8547 | train_epoch_time: 185.1226 | test_epoch_time: 12.9396\n",
      "Epoch: 4 | train_loss: 0.2371 | train_acc: 0.9163 | test_loss: 0.4185 | test_acc: 0.8608 | train_epoch_time: 184.9447 | test_epoch_time: 12.9673\n",
      "Epoch: 5 | train_loss: 0.1739 | train_acc: 0.9389 | test_loss: 0.3797 | test_acc: 0.8805 | train_epoch_time: 184.9552 | test_epoch_time: 13.0328\n"
     ]
    }
   ],
   "source": [
    "# Run non-compiled model for multiple runs\n",
    "NUM_RUNS = 3\n",
    "NUM_EPOCHS = 5\n",
    "\n",
    "# Create an empty list to store multiple run results\n",
    "non_compile_results_multiple_runs = []\n",
    "\n",
    "# Run non-compiled model for multiple runs\n",
    "for i in tqdm(range(NUM_RUNS)):\n",
    "    print(f\"[INFO] Run {i+1} of {NUM_RUNS} for non-compiled model\")\n",
    "    results = create_and_train_non_compiled_model(epochs=NUM_EPOCHS, disable_progress_bar=True)\n",
    "    non_compile_results_multiple_runs.append(results)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we've got a list of results from experiment 3, let's iterate through them and create a dataframe containing all of the results.\n",
    "\n",
    "We'll then average the results across the 3 runs by grouping by the epoch number (the index of the dataframe) and taking the mean of the results."
   ]
  },
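  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the grouping step concrete, here's the same per-epoch averaging written in plain Python for a single column.\n",
    "\n",
    "The numbers are the `train_epoch_time` values printed by the three experiment 3 runs above. The variable names are only for illustration.\n",
    "\n",
    "```python\n",
    "# train_epoch_time values (seconds) printed above, one inner list per run\n",
    "train_epoch_times_per_run = [\n",
    "    [185.1112, 185.0138, 185.0366, 185.0900, 185.0405],  # run 1\n",
    "    [185.0911, 185.0738, 185.0462, 184.9521, 184.9911],  # run 2\n",
    "    [184.9795, 184.9266, 185.1226, 184.9447, 184.9552],  # run 3\n",
    "]\n",
    "\n",
    "# zip(*runs) lines the values up by epoch position, similar to grouping by the dataframe index\n",
    "mean_time_per_epoch = [sum(epoch_times) / len(epoch_times)\n",
    "                       for epoch_times in zip(*train_epoch_times_per_run)]\n",
    "\n",
    "for epoch_index, mean_time in enumerate(mean_time_per_epoch):\n",
    "    print(f\"Epoch index {epoch_index} | mean train_epoch_time: {mean_time:.4f}\")\n",
    "```\n",
    "\n",
    "The pandas version in the next cell does this for every column at once."
   ]
  },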
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>train_loss</th>\n",
       "      <th>train_acc</th>\n",
       "      <th>test_loss</th>\n",
       "      <th>test_acc</th>\n",
       "      <th>train_epoch_time</th>\n",
       "      <th>test_epoch_time</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.815352</td>\n",
       "      <td>0.716103</td>\n",
       "      <td>0.590690</td>\n",
       "      <td>0.796710</td>\n",
       "      <td>185.060622</td>\n",
       "      <td>13.017663</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.447013</td>\n",
       "      <td>0.845567</td>\n",
       "      <td>0.618526</td>\n",
       "      <td>0.790150</td>\n",
       "      <td>185.004740</td>\n",
       "      <td>12.973144</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.322255</td>\n",
       "      <td>0.888117</td>\n",
       "      <td>0.436471</td>\n",
       "      <td>0.852321</td>\n",
       "      <td>185.068499</td>\n",
       "      <td>12.956863</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.242587</td>\n",
       "      <td>0.915120</td>\n",
       "      <td>0.436207</td>\n",
       "      <td>0.858946</td>\n",
       "      <td>184.995601</td>\n",
       "      <td>12.969341</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.179439</td>\n",
       "      <td>0.937612</td>\n",
       "      <td>0.479547</td>\n",
       "      <td>0.854727</td>\n",
       "      <td>184.995575</td>\n",
       "      <td>12.993280</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   train_loss  train_acc  test_loss  test_acc  train_epoch_time  \\\n",
       "0    0.815352   0.716103   0.590690  0.796710        185.060622   \n",
       "1    0.447013   0.845567   0.618526  0.790150        185.004740   \n",
       "2    0.322255   0.888117   0.436471  0.852321        185.068499   \n",
       "3    0.242587   0.915120   0.436207  0.858946        184.995601   \n",
       "4    0.179439   0.937612   0.479547  0.854727        184.995575   \n",
       "\n",
       "   test_epoch_time  \n",
       "0        13.017663  \n",
       "1        12.973144  \n",
       "2        12.956863  \n",
       "3        12.969341  \n",
       "4        12.993280  "
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Go through non_compile_results_multiple_runs and create a dataframe for each run then concatenate them together\n",
    "non_compile_results_dfs = []\n",
    "for result in non_compile_results_multiple_runs:\n",
    "    result_df = pd.DataFrame(result)\n",
    "    non_compile_results_dfs.append(result_df)\n",
    "non_compile_results_multiple_runs_df = pd.concat(non_compile_results_dfs)\n",
    "\n",
    "# Get the averages across the multiple runs\n",
    "non_compile_results_multiple_runs_df = non_compile_results_multiple_runs_df.groupby(non_compile_results_multiple_runs_df.index).mean()\n",
    "non_compile_results_multiple_runs_df"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wonderful!\n",
    "\n",
    "We can inspect these later, let's move onto experiment 4."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4.2 Experiment 4 - Multiple runs, with compile\n",
    "\n",
    "Time for experiment 4.\n",
    "\n",
    "Running a compiled model for multiple runs.\n",
    "\n",
    "| **Experiment** | **Model** | **Data** | **Epochs** | **Batch size** | **Image size** | **`torch.compile()`** |  \n",
    "|----- |-----| -----| -----| -----| -----| -----|\n",
    "| 4 (multi-run) | ResNet50 | CIFAR10 | 3x5 | 128 | 224 | Yes |\n",
    "\n",
    "We can do this by using the `create_compiled_model()` and `train_compiled_model()` functions we created earlier.\n",
    "\n",
    "We'll start by creating the compiled model *first* and then training it for 3 runs.\n",
    "\n",
    "We're not worried about the results of the model (loss and accuracy) as much as how long it takes.\n",
    "\n",
    "The reason why we compile it once at the start is that PyTorch only needs to run the optimization steps once (this can take some time) and then it can reuse them for the rest of the runs.\n",
    "\n",
    "We'll also create an empty list just like before to store our model's results over a series of runs. \n",
    "\n",
    "> **Note:** Running the following code can take quite a while depending on the speed of your GPU, for me, it took 18 minutes on a NVIDIA A100 on Google Colab Pro and around 45 minutes on a NVIDIA TITAN RTX."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Time to compile: 0.001275777816772461 | Note: The first time you compile your model, the first few epochs will be slower than subsequent runs.\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "f3338920367645c1b013ba2882dfc2d8",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/3 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[INFO] Run 1 of 3 for compiled model\n",
      "Epoch: 1 | train_loss: 0.8026 | train_acc: 0.7192 | test_loss: 0.6995 | test_acc: 0.7650 | train_epoch_time: 194.3336 | test_epoch_time: 20.6106\n",
      "Epoch: 2 | train_loss: 0.4440 | train_acc: 0.8483 | test_loss: 0.5565 | test_acc: 0.8089 | train_epoch_time: 169.3882 | test_epoch_time: 10.8076\n",
      "Epoch: 3 | train_loss: 0.3208 | train_acc: 0.8896 | test_loss: 0.4164 | test_acc: 0.8620 | train_epoch_time: 169.9283 | test_epoch_time: 10.8361\n",
      "Epoch: 4 | train_loss: 0.2329 | train_acc: 0.9197 | test_loss: 0.3635 | test_acc: 0.8792 | train_epoch_time: 169.8744 | test_epoch_time: 10.9050\n",
      "Epoch: 5 | train_loss: 0.1803 | train_acc: 0.9369 | test_loss: 0.4387 | test_acc: 0.8587 | train_epoch_time: 169.6391 | test_epoch_time: 10.8240\n",
      "[INFO] Run 2 of 3 for compiled model\n",
      "Epoch: 1 | train_loss: 0.1875 | train_acc: 0.9347 | test_loss: 0.4187 | test_acc: 0.8714 | train_epoch_time: 169.4814 | test_epoch_time: 10.8180\n",
      "Epoch: 2 | train_loss: 0.1288 | train_acc: 0.9550 | test_loss: 0.4333 | test_acc: 0.8698 | train_epoch_time: 169.4503 | test_epoch_time: 10.8263\n",
      "Epoch: 3 | train_loss: 0.0950 | train_acc: 0.9672 | test_loss: 0.4867 | test_acc: 0.8650 | train_epoch_time: 169.6038 | test_epoch_time: 10.8199\n",
      "Epoch: 4 | train_loss: 0.0943 | train_acc: 0.9675 | test_loss: 0.3714 | test_acc: 0.8966 | train_epoch_time: 169.5757 | test_epoch_time: 10.8221\n",
      "Epoch: 5 | train_loss: 0.0537 | train_acc: 0.9821 | test_loss: 0.5002 | test_acc: 0.8701 | train_epoch_time: 169.5253 | test_epoch_time: 10.8426\n",
      "[INFO] Run 3 of 3 for compiled model\n",
      "Epoch: 1 | train_loss: 0.0705 | train_acc: 0.9751 | test_loss: 0.4333 | test_acc: 0.8839 | train_epoch_time: 169.4846 | test_epoch_time: 10.9057\n",
      "Epoch: 2 | train_loss: 0.0595 | train_acc: 0.9802 | test_loss: 0.4341 | test_acc: 0.8904 | train_epoch_time: 169.6055 | test_epoch_time: 10.8804\n",
      "Epoch: 3 | train_loss: 0.0405 | train_acc: 0.9859 | test_loss: 0.4478 | test_acc: 0.8901 | train_epoch_time: 169.5788 | test_epoch_time: 10.8449\n",
      "Epoch: 4 | train_loss: 0.0365 | train_acc: 0.9873 | test_loss: 0.5382 | test_acc: 0.8765 | train_epoch_time: 169.6732 | test_epoch_time: 10.9873\n",
      "Epoch: 5 | train_loss: 0.0422 | train_acc: 0.9854 | test_loss: 0.5057 | test_acc: 0.8832 | train_epoch_time: 169.6618 | test_epoch_time: 10.8969\n"
     ]
    }
   ],
   "source": [
    "# Create compiled model\n",
    "compiled_model = create_compiled_model()\n",
    "\n",
    "# Create an empty list to store compiled model results\n",
    "compiled_results_multiple_runs = []\n",
    "\n",
    "# Run compiled model for multiple runs\n",
    "for i in tqdm(range(NUM_RUNS)):\n",
    "    print(f\"[INFO] Run {i+1} of {NUM_RUNS} for compiled model\")\n",
    "    # Train the compiled model (note: the model will only be compiled once and then re-used for subsequent runs)\n",
    "    results = train_compiled_model(model=compiled_model, epochs=NUM_EPOCHS, disable_progress_bar=True)\n",
    "    compiled_results_multiple_runs.append(results)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Experiment 4 done!\n",
    "\n",
    "Now let's put the results together into a dataframe and take the mean across each of the runs (we'll do this by grouping by the epoch number, which is the index number of the dataframe)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>train_loss</th>\n",
       "      <th>train_acc</th>\n",
       "      <th>test_loss</th>\n",
       "      <th>test_acc</th>\n",
       "      <th>train_epoch_time</th>\n",
       "      <th>test_epoch_time</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.353548</td>\n",
       "      <td>0.876332</td>\n",
       "      <td>0.517181</td>\n",
       "      <td>0.840124</td>\n",
       "      <td>177.766548</td>\n",
       "      <td>14.111428</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.210781</td>\n",
       "      <td>0.927845</td>\n",
       "      <td>0.474630</td>\n",
       "      <td>0.856375</td>\n",
       "      <td>169.481367</td>\n",
       "      <td>10.838063</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.152098</td>\n",
       "      <td>0.947577</td>\n",
       "      <td>0.450293</td>\n",
       "      <td>0.872396</td>\n",
       "      <td>169.703638</td>\n",
       "      <td>10.833619</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.121230</td>\n",
       "      <td>0.958177</td>\n",
       "      <td>0.424376</td>\n",
       "      <td>0.884065</td>\n",
       "      <td>169.707751</td>\n",
       "      <td>10.904810</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.092080</td>\n",
       "      <td>0.968116</td>\n",
       "      <td>0.481520</td>\n",
       "      <td>0.870649</td>\n",
       "      <td>169.608708</td>\n",
       "      <td>10.854486</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   train_loss  train_acc  test_loss  test_acc  train_epoch_time  \\\n",
       "0    0.353548   0.876332   0.517181  0.840124        177.766548   \n",
       "1    0.210781   0.927845   0.474630  0.856375        169.481367   \n",
       "2    0.152098   0.947577   0.450293  0.872396        169.703638   \n",
       "3    0.121230   0.958177   0.424376  0.884065        169.707751   \n",
       "4    0.092080   0.968116   0.481520  0.870649        169.608708   \n",
       "\n",
       "   test_epoch_time  \n",
       "0        14.111428  \n",
       "1        10.838063  \n",
       "2        10.833619  \n",
       "3        10.904810  \n",
       "4        10.854486  "
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Go through compile_results_multiple_runs and create a dataframe for each run then concatenate them together\n",
    "compile_results_dfs = []\n",
    "for result in compiled_results_multiple_runs:\n",
    "    result_df = pd.DataFrame(result)\n",
    "    compile_results_dfs.append(result_df)\n",
    "compile_results_multiple_runs_df = pd.concat(compile_results_dfs)\n",
    "\n",
    "# Get the averages across the multiple runs\n",
    "compile_results_multiple_runs_df = compile_results_multiple_runs_df.groupby(compile_results_multiple_runs_df.index).mean() # .index = groupby the epoch number\n",
    "compile_results_multiple_runs_df"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4.3 Compare results of experiment 3 and 4\n",
    "\n",
    "Multi-run experiments done!\n",
    "\n",
    "Let's inspect the results.\n",
    "\n",
    "We can do so with our `plot_mean_epoch_times()` function we created before.\n",
    "\n",
    "This time we'll set the `multi_runs` parameter to `True` so that our plots reflect the fact we're plotting the results of multiple runs.\n",
    "\n",
    "We'll make sure we've got a directory to save the figure to as well. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Mean train epoch time difference: -7.443% (negative means faster)\n",
      "Mean test epoch time difference: -11.351% (negative means faster)\n",
      "[INFO] Plot saved to pytorch_2_results/figures/multi_run_NVIDIA_TITAN_RTX_ResNet50_CIFAR10_224_train_epoch_time.png\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAroAAAHOCAYAAAB3pqpTAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAABLNklEQVR4nO3dd7gdZdWw8XsBgVAiNRZqAhJKIAQISCdKE2nSpIQSkKYgxfJSVAy+oqgogqJ54ZOmNKUoCCJSQlEQEwglFGlBQg2BQCgJJKzvj5kTdk5OzTnJJJP7d13nOntPeWbN7Jln1n7mmdmRmUiSJEl1s0DVAUiSJEmzg4muJEmSaslEV5IkSbVkoitJkqRaMtGVJElSLZnoSpIkqZZMdCXNUyIiI+LTbYwfExGDO1jW2IjYtrtimxe1tz0laV5moitpjiiTyvcjYrlmw0eXyVafWSjzooj4QeOwzOyfmSO6Fu38qaXtKUnzMhNdSXPSs8B+TW8iYl1g0erC6R4RsdC8XL4k1ZWJrqQ56XfAQQ3vDwYuaZwgIkZExGEN74dGxN3NC4qII4AhwP9ExNsRcX05fHp3hIgYFhFXRcSVETEpIu6PiPVaCiwiFoiIkyLi6YiYEBF/iIhlWpl2cESMi4gTI+Jl4MKW4mzsFlC2lp4bETeUsfwrIlZrpfw+5bxfjoj/AreVww+NiMci4o2I+FtErFIOj4g4KyJejYg3I+KhiFinm7bniRHxQhnzExGxTUsxS9LcyERX0px0L/CxiFgrIhYE9gF+PysFZeZ5wKXATzJziczcpZVJdwP+CCwDXAb8KSJ6tDDdscAXga2B5YE3gHPbCOGTZZmrAEd0MOz9gNOApYGngNPbmX5rYC1gh4j4InAKsAfQG7gLuLycbntgK6AfsBTFdp3QwZiAlrdnRKwBHANslJm9gB2AsZ0pV5KqZKIraU5ratXdDngceGE2L29UZl6VmR8APwd6Apu0MN2RwLczc1xmTgGGAXu10W3gQ+B7mTklM9/rYCzXZOZ9mTmVIqkc2M70wzLznbL8I4EfZeZj5fw/BAaWrbofAL2ANYEop3mpgzG1ZRqwCLB2RPTIzLGZ+XQ3lCtJc4SJrqQ57XfA/sBQmnVbmE2eb3qRmR8C4yhabJtbBbg2IiZGxETgMYpE7xOtlDs+Myd3MpaXG16/CyzRzvTPN7xeBTi7Ib7XgQBWyMzbgF9RtEC/EhHnRcTHOhnbTDLzKeB4iqT/1Yi4IiJa2naSNFcy0ZU0R2XmcxQ3pX0BuKaFSd4BFmt4/8m2iuvAIldqehERCwArAi+2MN3zwI6ZuVTDX8/MbK3FufmyZ4g7ItqKu6Mal/E8cGSz+BbNzH8CZOY5mbkh0J+iC8O3WoqLTm7PzLwsM7egSLQT+PGsr44kzVkmupKq8GXgc5n5TgvjRgN7RMRi5Y1cX26jnFeAVdtZ1oYRsUfZBeF4YApFX+HmhgOnN9zg1Tsidmun7EYPAv0jYmBE9KRoBe1Ow4GTI6J/Gd+SEbF3+XqjiPhM2ff4HWAyRWs0dGF7RsQaEfG5iFikLPO9hnIlaa5noitpjsvMpzNzZCujzwLep0i6Lqboy9qa31L0H50YEX9qZZo/U9yc9QZwILBH2V+3ubOB64CbI2ISRTL8mfbWpUlm/gf4PnAL8CQw05MNuiIzr6VoTb0iIt4CHgF2LEd/DDifYh2fo7gR7cxyXFe25yLAGcBrFN0uPk5xQ5wkzRMisyNX/iRp3hMRw4BPZ+YBVcciSZrzbNGVJElSLZnoSpIkqZbsuiBJkqRaskVXkiRJtWSiK0mSpFoy0ZUkSVItmehKkiSplkx0JUmSVEsmupIkSaolE11JkiTVkomuJEmSaslEV5IkSbVkoitJkqRaMtGVJElSLZnoSpIkqZZMdCVJklRLJrqSJEmqJRNdSZIk1ZKJriRJkmrJRFeSJEm1ZKIrSZKkWjLRlSRJUi2Z6EqS
JKmWTHQlSZJUSya6mq0iYmxE9Kk6jiYRMSwifl91HAAR8Y+IWL/qONoSEbtGxBXtTDMsIobNoZDUioi4KCKGzuFlDo2Iuzs47UUR8YPZHdO8rNyeF1Udx+wSEWMiYnDVcXSXueX81pnjsJ1yTomI/9cdMc1NOpToRsS+EfGviHgnIl4tX381IqIcf1FEvB8Rb0fE6xHx94hYs2HcD5qV1yciMiIW6uDyMyIejogFGob9oCy7Z0RMjIjPtTDfWRFxVfl6bERsW74eGhHTynjfjohnI+LCiOjXXozlST0jYuM24h3eUPb7EfFBw/u/NpZdvm8a90HDdnw7IoaX5S1evr+xhWWNjYhXImLxhmGHRcSIVmJrWnbTMsZGxEnluDENw6dFxOSG96dExNci4pGIWLihvOMj4oGOfpbNYmn+OTT9Ld/ZsqpW7hcfNFuPVduYfhdgUmY+UL7fNyKeiIg3y2Ps4oj42BxbgVZk5nXAOhExYFbLKPe3d8ptMiEibo2IfTox/+CIGDery2+lzIXLz+zJMraxEXFBlCetiBgREYc1LP/DZp/t9Q1ltVgnNNu/34qIByNi52bTnFd+7h9GC0lqRJwQES+X+8UFEbHILK5v03F/f7Phy5V1zthZKXd26ci2m8VyO1VftjB/S+ezEc3qyieajd8mIh6PiHcj4vaIWKWL8W87q/NXLTP7Z+aI2VV+RHw8Ii6PiBfLY+YfEfGZhvE7RcTdUeQML0fE+RHRq4VylomI8dGF5LGF89szEfGVTsw/x78YZuYPM/Ow2bmMiPhseRy82bzeae/zK6f5WhQ521sRMTIitmhvme0muhHxDeBs4KfAJ4FPAEcBmwMLN0z6k8xcAlgReBW4qL2yO2l5YN/mAzNzMnAlcFCzuBcE9gMubqW8e8p4lwS2Bd4DRkXEOq0FEBEBHAi8Dhzc2nSZeVRmLlGW/0Pgyqb3mbljs2l3bJj2UsrtWP4dVU62FzAF2D4iPtXCIhcCjmstnlYsVS5zL+C7EbFdWQk1xXIXcExDLD8EzgUmAt8ut8eqwGnAlzNzaieX3+SehmU0/b04i2VV7cpm6/FMG9MeBfyu4f0/gM0zc0lgVYrPtEOVXMzCl4xOuhw4ootlrFfuV2tQ1A2/iojvdTWwLrgK2BXYn6IOWA8YBWzTyvQvNvtsd4EO1QlN9cxSwK+BKyJiqYbxDwJfBe5vPmNE7ACcVMbUh2K/OK1TazmzxZvVcfsDz3axzNmlvW03q2alvmxPY125RtPAiFgOuAb4LrAMMJLifKXZYwng38CGFNv7YuCGiFiiHL8kRb26PLAWRb7y0xbK+THwWDfEc0/DOXUv4Ccxl1/FmwPeAS4AvtXCuDY/vzLpPYNiWy4J/Ba4tsz3WtVmohsRSwLfB76amVdl5qQsPJCZQzJzSvN5MvNd4DKg1YRxFv0EOK2Vk/rFwJ4RsVjDsB0o1u+vbRWamdMy8+nM/CpwBzCsjcm3pDhAjgP2jYaWzdnsYGA48BAwpIXxPwW+OSsngcwcCYwBBnZg2g+BLwMnRNHCdz7w68yc6STdHcrWi5Mj4tGIeCOKVveeDeMPj4inoriKcF00tARHRP8oriy8XrbgnNJQ9MIRcUlETIqiFXtQw3wnRsQL5bgnIqK1xKcr67Uw8DmK/Q2AzHw+M19rmGwa8Ok2ysiIODoingSejBauQMSMrZJDy5aMM8tt+WxE7Ngw7dCyxWFSOa5xPxsB7NTV9S7X87XM/B3wFeDkiFi2XP4hEfFYufxnIuLIcvjiFMfw8g0tI8tHxMYRcU/ZMvNSRPyqo8djFC1i2wG7Zea/M3NqZr6Zmedm5m87uUodqhPKY+d3wOLA6g3Dz83MW4HJLcx2MPDbzByTmW8A/wsM7WR8zf2OGRPyg4BLGieIiLXKfWdieXzs2jBu2fJYeysi7gNWazbvmg3H3RMR8aUuxtvitouIRcp9+b/l8T08IhYtxy0XEX8p4389Iu6KhquBtFNf
trYOEXEERf37P9GsVb8NewBjMvOPZaPMMGC9KK94dkV5zP4jiiuXE8vjZrNy+PNRXBk6uGH6naK4+vZWOX5Ys/IOiojnorjq8t2Y8SroAhFxUkQ8XY7/Q0Qs00pcrW7/ZmVObDim3ynrrz7luJ0jYnQ5zT+jg1eUMvOZzPx5Zr5UntvPo2iQW6Mcf1lm3pSZ75bH1PkUjXaN8W9Kkb9c2JFldlR5nnyMIsFuWtYf46MrNndGRP9yeIv7WkSsFBHXRNHaPCEiftUs9hbr9+ailfNcNHTtK+vUxqtYU5v2mbIOvrqM49mIOLYT2+G+8hwwU0NQe58fxRf+MZk5KjOTou5aDvh4W8tsr0V3U2AR4M8dXYkoMu8hwAOdmOfXEfHrdia7BniLFir6zPwn8BJFpdLkQOCyTrY0XkNx4mrNwcD1fPSNvMuX0toTESsDgylaey+lWct1aSRFMvLNWSh/E4qD+qmOTJ+ZTwA/Am6j+Dbc1Ram9gyh+NKyGtAP+A5AFF1VfgR8CfgU8BxwRTmuF3ALcBNFEvJp4NaGMnctp10KuA74VTnfGsAxwEaZ2atc7thy3BYRMbGdWHcpK/Yx0fYlqtWBDzNzhsvx5TLeBCYBewK/aGd5XwQ+A6zdznRNPgM8QVEx/AT4bRQWB84BdizXezNgdMN8jwF9onu7UvyZomWt6XL/qxTH08eAQ4CzImKDzHwH2JEZW1RfpPgicEK5LptStHp+tanw8kR7UivL3ha4LzOf74b16FCdEEWLwyHABxT7akf0p2jxbfIg8IkovxzMot9TJOQLRsRaQC/gXw1x9qBYn5spTh5fAy4tjw0orupMpjjmDi3/muZdHPg7RUPHxymuqP266eTdXJnEtHvZsZVt92OK+mAgxfG9AnBqOe4bwDigN8UVyFOAbCiy1fqyrXUoT7qNV912aZj1RxHxWpl4Dm4YPsNnWO7PT5fDu8NnKBpAli1jvgLYiGKbHEBx5aSpNfMdivPHUhRfXL8SEV8s13ttilbzIRSf7ZIU27TJsRT1zdYUdeobFPtCS9rb/gBk5lINrZ1nU1xFfCEiNqBo8TuyXK//A66LsttOB/MFymkHUiRKrZ3ftqJo6GmafsFyvY5pKeauiIiNKPbZkQ2D/0pxPvg4xVWdSwFa2tfK2P5CcQz0ofh8Gu+faLF+byGOVs9zjTLzmIbPZwuKz/zP5ZeW6yn26xUo6t7jo7gC1dFzZYe08Pn9FVgwIj5Tbo9DKc5VL7dVTnuJ7nLAa43JYvntamJEvBcRWzVM+81y5Z6iaH4e2tGVycyvli2qbU5Gcfnn1Gi5n9ollElgeULejda7LbTmRYrm8plE0Vq8N0Xy/AHFpc9Wuy90o4OAhzLzUYpLyP2j5UsfpwJfi4jeHSz3tYh4D7iHooL7UydiuouiArqqbKXoik3K/anp7+lm439Vtna+DpxOceKBokK+IDPvL68snAxsWrYI7Ay8nJk/y8zJ5ZWIfzWUeXdm3piZ0yhaitYrh0+j+GK3dkT0yMyxmfk0QGbenZlLtbEef6D4pt4bOJxiP92vlWmXokhmZ1AuY0k+upw2to3lAfwoM1/PzPfama7Jc5l5frneF1Oc0D5RjvuQoi/uouW36TEN8zXFulQHl9Ou8hh6jfJ4y8wbyisrmZl3UCRarX7pLL/R31u2xo6lOBlu3TB+58w8o5XZl6X4YtwZyzfbT7/UwTphk7JenAycCRyQma92cJlLAG82vG96PVOfwk4YR3Ey3JYi1kuajd+kXO4Zmfl+Zt5GcXLdrzyx7AmcmpnvZOYjzFjH7gyMzcwLy8/lfuBqisuMMykTnbb6QLa47cqT9+HACeX+P4mii1hT17YPKPbtVTLzg8y8q2z9adRafdmpdSidSNGtZAXgPOD6iGhq6W7+GVK+78pn2OjZMtZpFF+2VgK+n5lTMvNm4H3KK0OZOSIzH87MDzPzIYrzSdMxsxdwfVkHvU+xfRq32ZHAtzNzXFnf
DgP2ipavsHZk+08XRX/9/YE9y+PocOD/MvNfZavexRRd9zYp16Mj+UJTHvA74LTMbP4ZEBHbURwDpzYMPhb4V2aOaq/8Dmo6v70N3FfG82TTyMy8oDw/NW3T9aK4kt6SjSm+ZHyrPP4mNzt+2qrfG7V6nmtJeYz8CfhaFveUbAT0zszvl3XEMxQt4/uW69TeubJDWvn8JlEcj3dT7BPfA45oa/+C9hPdCcByjTtzZm5WrsSEZvOfWVZcn8zMXRs23FSgR7Nye1CcWD9sZ/kzyMwbgf/Scn/BS4DPRsQKFAftU+WH0hkrUPS1a8nuFOvSdEPYpcCOnUgsZ9VBfPQt70WKy90zJdjlSecvFH36OmI5ikr4mxQtxs0/oxZFcWn2/4BfAsdEGzdcddC95X7T9Ldas/GNrW7PURzolP+nt4xl5tsU++QKFJV9qwcuM377exfoGRELZeZTwPEUFc6rEXFFdPDGuMx8NDNfLCvmf1K0ULR2cnyDNk50mfkCRWt0m087YMZt0xHT1zuLLkYAS2TRyrQPRb/hlyLihpjx0mpTrBM7ubxWlS2HvSmPt4jYMSLuLVvEJwJfoNhHW5u/X9lq+3JEvEWR6LQ6fTMTKE4CnfFis/30D3SsTri3rC+Xprh60NYVo+bepmjhbtL0eqYvSZ10CUVDxH4ULbyNlgeez6K7QJPnKI6r3hSt8M2PySarAJ9p/EJA8YX0k7MYZ2vbrjewGMU9FU3LuakcDsWXxKeAm6O4nD9TndhGfdnpdSgTskllcnkxRX/7L5Sjm3+GlO+7+hk2eaXh9XtlPM2HTe/fGMVNQOPLK0dH8dExszwNn2tZP0xoKGcVir6QTdvkMYqEqaVEqt3t36RstPkVsHtmjm9Y1jeafQYr8VHd364ourFcT7EP/aiF8ZtQtIDvlZn/KYctT5Hofrujy+mApvPbEhT7UH+KuoryqsoZUXQHeYuPGjZaq8dWokhmW7tK3WL93nyizpznynr6Koov803no1Vo9sWfotW+pX1hlrTx+R1G0Yrbn6Kl9wDgL+2dp9tLdO+hyJp3m+WIi8S0T7NhfZm5Mu2o71DsiI39ccnM/1K0NA6h6LbQvKWiI3Yvy2jJwRQ7zX8j4mXgjxTJYWutdl0WEZtRXNY4uTyhv0xxeWK/Vr5Jf4/i2/AKLYybSZmU/YyixaTdb8il71JcZj6Oot/w/3Vwvlm1UsPrlSla3Sn/T797OYpLjssCL1BU2M0T5g7Jog/XFmXZSXGJdJaKAma6bFR6kuI+prY+p4Vofx0av8W+U/5vPC46nGBk5t8yczuKBPBxim/oTdaiaOV6q6PldcBuFEnifeUVmqspWu0+USY3N/LR9mvp2/pvyjhXz8yPUVS0rW3v5m4BNo6IFWc9fKATdUL5ReyrwIGtXJFpyRg+utpA+fqVzJzQyvQddTXFpetnMrN5N4oXgZVixj6tK1McV+MpPrPmx2ST54E7mn0hWCIzO3yneUta2HavUSRw/RuWs2SZTFAmnd/IzFWBXYCvR8t97VuqL9tbh45czm489mf4DMt6ajUaLpfPQZdRfGFYKYsrR8P5KM6XKK4kAdMTjcYuMs9TdG1q3C49yy/lM+jo9i+/EF5LcSNfY6PU88DpzZa1WGZe3pGVLOuTP1Hss0e2MH79cjscmkX/+CYbU9R/j5bH89kU9cTL0c7NTh1RfgG5mmKbQNGKvRvF1ZUl+ShPaq3eex5YuZVzf2dj6eh57pcUX8q+0yyOZ5t9Pr0y8wstF9E57Xx+61FcefhPFlcmbqLYdzdrq8w2E93MnEjRB/PXEbFXRCwRRaf0gRQ3BnTE1cBOEbF9+Q1meYqN1l5rVWsxjQAepuVuAxdT9D3ZnLIVtD1lTH0j4pcULZsz9TktE5JtKC5rDSz/1qPYOWZn94WDKfqLrd2w3HUokpmZOpqX39SupPhW2hlnUHR679nWRBGxXln24eWlgmEUfTcP6eTyOuPoiFgxihsfTuGj
vpCXAYdExMDywPghxSWnsRQtNZ+M4tFni0REr2j2iJKWRMQaEfG5srzJFCfTaR0JMiJ2i4ilo7AxxXZqsW97FpfnbqHhUntEDImIlcv5V6HopnFrS/O3UuZ4iorhgHKfPpQOJvsR8Ykonpe7OMUX27eZcb23pp2bOjsqisf2DKHoB/fjMmlbmOJS2nhgahQ3UWzfMNsrwLIx4yW9XhR99t8uW587nExl5i0Ux9W1EbFhFI/56xURR5XbrSPr0ek6oVzX/0fDpdIoHnPWk+Lk1iOKxyU21cuXAF+OiLUjYmmKevOijq5na8oW/M9RtI409y+KL03/ExE9ouhvugtwRRaXRK8BhkXEYlH062xc178A/SLiwHLeHhGxURR9gbsa8/RtVzaQnE/Rj/vjUHwe8VEfwZ0j4tMRERT7yDRaOI5bqS/bW4dXKLopUC5rqYjYofzcFir37a2Av5WTXEvRJWjP8nM+laIr2uNd3SazoBfwemZOLuuo/RvGXUVxj8FmUVy1O40ZvzgOB04v6yYiondEtNgA1pHtXyZrVwOXZmbzp1CcDxwVRQt0RPF4zZ2ihceAtbDsphbI94CDmjemRfHEkZsoLsM3v5nwrxTJ5sDy71SKe40Glvt+l0TRt353PvqS04uivp1AcU7/YbNZZtjXKLo+vAScUW6TnhGxOZ3U0fNcFDcEbw3s32w73ge8FcUNbYuW55t1ouiD3JHlL1AeCz2Kt9Gz3Ofa/fwonsiwU0SsWu4b21H0e36kzYVmZrt/FK2k91Fc5h1PURkeASxcjr8I+EEb8+9C8eieNykudf0UWLRh/HBgeBvzJ/DphvefKYdd1Gy6xSm+ffy1hTLGAtuWr4dSfLBvU1Tqz1EkyWs1TN+nXMZCFJe3RrVQ5vIU/ZHWaSP2YcDvmw2bXnaz4dO3I9CT4hL3Li2U+WuK/rEzrFf5fiWKnXdEK/HMtGyKCm0MxcHfNGwEcFjD+wUpOtH/T7PyBlO0sHyileWNBfq0Mq7xc2j826hh3pOBRykum18MLNYw/1EUXRRepzhBrdgwbh2KRPENiks6J7X0eTT7nAdQ7OeTGspcvpxuS+DtNj7nyykqrLcpWhqPbeeY2omG/ZQisR1HsT+Oo+jrt2xHj4ly2I4Uj4qaCPyMopvLYQ3b+u6WyqBoxbiD4vicWH72azdM9zDF48Ha2seHtRPrO+W2eR24naLybJzmaIqKfSJFv6wraKhTKG5OmVCOX54imXi8LPMuiqfD3N0w/V+BU9qIqelk/hQf1QH/D1i5+f5PsY+PazZ/u3VCK9t8RYqT24CG5WSzv8EN03+93C5vUdwFvkgb63QRMLSjx33DuG0pWuyb3vdv2B8epbis3DSuN8Vx8RbFsfK/zbb7GsANFOeJCRQ3rQ5siK/xM30b2LKNuqHVbUdRP/6Q4s7ttygupR9bTncCRd3RdCx9t1l91GZ92c46rE5x88tEilan3hQn30nlsHuB7VrYvo9TnLxH0Ep92LDeF7Uxfnr8zbcRxbGczaYfB2xRvt6LYj+fVH6Gv2LGunAoxRXYCRRX7l5o+nwoGsW+TtG/exJFvfvDVmJsd/vz0f7YVC80/TUdf58vt+tEiuTuj0Cvclyr+QJFYpYUuUpjuU3rcSFFl8nGcWM6ug+28nl09Pz2KsV54uPl+CUoGkMmlZ/LQTTU6zTb18phK1PsdxMozrvntHG8zHSOKIe3dZ4b1rRPUOyrTQ0fTX+nNNRzl1OcW9+g2O+b9sv2zpWDmbnOG9HBzy8o6vr/lvE/BhzY1meUmUQ5szRbRPFA6MFZtLTOyryHZdECVztRPIy8qYP/XCmKH7Y4MDNbfUxUlI+cycxhcygstSCKX9QakZkXVRyKZlEUPxoyODOHVhzHEhQJ1uqZ+WyVsczNunJ+05wzux80L6kVWfSRmqtlcXmvI88LlTQPK7/U3krRanYmxZWcsVXGJHUHE13Nbr+gG+/W11xpRNUBCCguaY6tOAZ1
zWiqqy93o+g2FBTd1PZNL/m25xd4fpvr2XVBkiRJtdTe48UkSZKkeZJdF1qx3HLLZZ8+faoOQ5IkqV2jRo16LTNn949YzXNMdFvRp08fRo4c2f6EkiRJFYuI5j8AI+y6IEmSpJoy0ZUkSVItmehKkiSpluyjK0nSHPbBBx8wbtw4Jk+eXHUomsf07NmTFVdckR49elQdyjzBRFeSpDls3Lhx9OrViz59+hARVYejeURmMmHCBMaNG0ffvn2rDmeeYNcFSZLmsMmTJ7Psssua5KpTIoJll13WKwGdYKIrSVIFTHI1K9xvOsdEV5IkSbVkH11JkirW56QburW8sWfs1K3lSfMqW3QlSdI87Qtf+AITJ04EYIkllujUvMOGDePMM8+cDVG1rSnOsWPHctlll83x5c8vTHQlSdI87cYbb2SppZaa7cuZOnVqt5dpojt7mehKkjQfGjt2LGuttRaHH344/fv3Z/vtt+e9995j9OjRbLLJJgwYMIDdd9+dN954A4DBgwdz4oknsvHGG9OvXz/uuuuuVsueNm0a3/zmN1l33XUZMGAAv/zlLwG49dZbWX/99Vl33XU59NBDmTJlCgB9+vThlFNOYdNNN2XQoEHcf//97LDDDqy22moMHz4cgBEjRrDVVlux++67s/baa3PUUUfx4YcfTp//tddemymOn/70p2y00UYMGDCA733ve9OHn3766ayxxhpsu+22PPHEE21up8GDB3PKKaew9dZbc/bZZzNq1Ci23nprNtxwQ3bYYQdeeuklAM455xzWXnttBgwYwL777gvM3Fq8zjrrMHbs2BnKP+mkk7jrrrsYOHAgZ511FmPGjGHjjTdm4MCBDBgwgCeffLLN+NQ2E11JkuZTTz75JEcffTRjxoxhqaWW4uqrr+aggw7ixz/+MQ899BDrrrsup5122vTpp06dyn333ccvfvGLGYY3d9555/Hss8/ywAMP8NBDDzFkyBAmT57M0KFDufLKK3n44YeZOnUqv/nNb6bPs9JKK3HPPfew5ZZbMnToUK666iruvfdeTj311OnT3HffffzsZz/j4Ycf5umnn+aaa65pNYabb76ZJ598kvvuu4/Ro0czatQo7rzzTkaNGsUVV1zBAw88wDXXXMO///3vdrfTxIkTueOOOzj22GP52te+xlVXXcWoUaM49NBD+fa3vw3AGWecMX19m5LzjjjjjDPYcsstGT16NCeccALDhw/nuOOOY/To0YwcOZIVV1yxw2VpZt6MJknSfKpv374MHDgQgA033JCnn36aiRMnsvXWWwNw8MEHs/fee0+ffo899pg+bfOWyUa33HILRx11FAstVKQZyyyzDA8++CB9+/alX79+08s+99xzOf744wHYddddAVh33XV5++236dWrF7169aJnz57T+99uvPHGrLrqqgDst99+3H333ey1114txnDzzTdz8803s/766wPw9ttv8+STTzJp0iR23313FltssRmW25Z99tkHgCeeeIJHHnmE7bbbDiharj/1qU8BMGDAAIYMGcIXv/hFvvjFL7ZbZms23XRTTj/9dMaNG8cee+zB6quvPstlyRZdSZLmW4ssssj01wsuuOD0hLK96RdccME2+6tm5kzPe83MDpW9wAILzBDXAgssMH1Zzcts65mymcnJJ5/M6NGjGT16NE899RRf/vKX252vJYsvvvj0Mvv37z+9zIcffpibb74ZgBtuuIGjjz6aUaNGseGGGzJ16lQWWmih6d0rgA790MP+++/Pddddx6KLLsoOO+zAbbfd1qlYNSNbdCVJqtjc8jiwJZdckqWXXpq77rqLLbfckt/97nfTW3c7Y/vtt2f48OEMHjyYhRZaiNdff50111yTsWPH8tRTT/HpT396lsq+7777ePbZZ1lllVW48sorOeKII1qddocdduC73/0uQ4YMYYklluCFF16gR48ebLXVVgwdOpSTTjqJqVOncv3113PkkUd2aPlrrLEG48eP5557
7mHTTTflgw8+4D//+Q9rrbUWzz//PJ/97GfZYostuOyyy3j77bfp06cPf/nLXwC4//77efbZZ2cqs1evXkyaNGn6+2eeeYZVV12VY489lmeeeYaHHnqIz33uc53aTvqIiW7FuvvZieq4ueXEIklzk4svvpijjjqKd999l1VXXZULL7yw02Ucdthh/Oc//2HAgAH06NGDww8/nGOOOYYLL7yQvffem6lTp7LRRhtx1FFHdarcTTfdlJNOOomHH354+o1prdl+++157LHH2HTTTYHicV6///3v2WCDDdhnn30YOHAgq6yyCltuuWWHl7/wwgtz1VVXceyxx/Lmm28ydepUjj/+ePr168cBBxzAm2++SWZywgknsNRSS7HnnntyySWXMHDgQDbaaKPp3TYaDRgwgIUWWoj11luPoUOHMnnyZH7/+9/To0cPPvnJT87QR1mdF+1dSphfDRo0KEeOHDnbl2OiWx0TXUlVeeyxx1hrrbWqDmOeMmLECM4888zpLaTzs5b2n4gYlZmDKgpprmUfXUmSJNWSXRckSdIs+dvf/saJJ544w7C+ffty7bXXdvuyBg8ezODBg7u93CZHH300//jHP2YYdtxxx3HIIYfMtmVq9jPRlSRJs2SHHXZghx12qDqMbnHuuedWHYJmA7suSJIkqZZMdCVJklRLJrqSJEmqJfvoSpJUtWFLdnN5b3ZveV106qmnstVWW7HtttsyePBgzjzzTAYN6tiTsKp6rFhjnD/84Q855ZRT5ujy1T1s0ZUkSbPV97//fbbddtvZvpy2fpa4K374wx/OlnI1+5noSpI0n7rkkksYMGAA6623HgceeCDPPfcc22yzDQMGDGCbbbbhv//9LwBDhw7lK1/5Cp/97GdZddVVueOOOzj00ENZa621GDp06PTyllhiCb7xjW+wwQYbsM022zB+/Pjp81911VUzLf/mm29m0003ZYMNNmDvvffm7bffBuCmm25izTXXZIsttuCaa65pcx2GDRvGEUccwfbbb89BBx3E+PHj2XPPPdloo43YaKONpj8y7I477mDgwIEMHDiQ9ddfn0mTJjFixAh23nnn6WUdc8wxXHTRRTOUf9JJJ/Hee+8xcOBAhgwZwjvvvMNOO+3EeuutxzrrrMOVV17Z6e2uOcdEV5Kk+dCYMWM4/fTTue2223jwwQc5++yzOeaYYzjooIN46KGHGDJkCMcee+z06d944w1uu+02zjrrLHbZZRdOOOEExowZw8MPP8zo0aMBeOedd9hggw24//772XrrrTnttNNaXf5rr73GD37wA2655Rbuv/9+Bg0axM9//nMmT57M4YcfzvXXX89dd93Fyy+/3O66jBo1ij//+c9cdtllHHfccZxwwgn8+9//5uqrr+awww4D4Mwzz+Tcc89l9OjR3HXXXSy66KId2k5nnHEGiy66KKNHj+bSSy/lpptuYvnll+fBBx/kkUce4fOf/3yHylE1THQlSZoP3Xbbbey1114st9xyACyzzDLcc8897L///gAceOCB3H333dOn32WXXYgI1l13XT7xiU+w7rrrssACC9C/f3/Gjh0LwAILLMA+++wDwAEHHDDD/M3de++9PProo2y++eYMHDiQiy++mOeee47HH3+cvn37svrqqxMRHHDAAe2uy6677jo9cb3llls45phjGDhwILvuuitvvfUWkyZNYvPNN+frX/8655xzDhMnTmShhWbtNqV1112XW265hRNPPJG77rqLJZfs5v7V6lbejCZJ0nwoM4mINqdpHL/IIosARTLb9LrpfWt9Y9sqPzPZbrvtuPzyy2cYPnr06Hbjam7xxRef/vrDDz/knnvumanF9qSTTmKnnXbixhtvZJNNNuGWW25hoYUW4sMPP5w+zeTJk9tdVr9+/Rg1ahQ33ngjJ598Mttvvz2nnnpqp+LVnGOLriRJ86FtttmGP/zhD0yYMAGA119/nc0224wrrrgCgEsvvZQtttiiU2V++OGH
0/viXnbZZW3Ov8kmm/CPf/yDp556CoB3332X//znP6y55po8++yzPP300wAzJcLt2X777fnVr341/X1Tt4qnn36addddlxNPPJFBgwbx+OOPs8oqq/Doo48yZcoU3nzzTW699dYWy+zRowcffPABAC+++CKLLbYYBxxwAN/85je5//77OxWf5ixbdCVJqloFjwPr378/3/72t9l6661ZcMEFWX/99TnnnHM49NBD+elPf0rv3r258MILO1Xm4osvzpgxY9hwww1Zcskl27xRq3fv3lx00UXst99+TJkyBYAf/OAH9OvXj/POO4+ddtqJ5ZZbji222IJHHnmkwzGcc845HH300QwYMICpU6ey1VZbMXz4cH7xi19w++23s+CCC7L22muz4447ssgii/ClL32JAQMGsPrqq7P++uu3WOYRRxzBgAED2GCDDTjooIP41re+xQILLECPHj34zW9+06ltpDkrMrPqGDotIi4AdgZezcx1ymFXAmuUkywFTMzMgRHRB3gMeKIcd29mHtXeMgYNGpQjR47s7tBn0uekG2b7MtSysWfsVHUIkuZTjz32GGuttVbVYXS7JZZYYvqTEzT7tLT/RMSozOzYw4nnI/Nqi+5FwK+AS5oGZOY+Ta8j4mdA49fjpzNz4JwKTpIkSdWbJxPdzLyzbKmdSRQ92L8EfG6OBiVJ0nxudrbmXnjhhZx99tkzDNt8880599xzZ9syNe+bJxPddmwJvJKZTzYM6xsRDwBvAd/JzLuqCU2SJM2KQw45hEMOOaTqMDSPqWOiux/QeIvmS8DKmTkhIjYE/hQR/TPzreYzRsQRwBEAK6+88hwJVpI0f+rI472k5ubFe6uqVKvHi0XEQsAewPTbPDNzSmZOKF+PAp4G+rU0f2ael5mDMnNQ796950TIkqT5UM+ePZkwYYJJizolM5kwYQI9e/asOpR5Rt1adLcFHs/McU0DIqI38HpmTouIVYHVgWeqClCSpBVXXJFx48Yxfvz4qkPRPKZnz56suOKKVYcxz5gnE92IuBwYDCwXEeOA72Xmb4F9mbHbAsBWwPcjYiowDTgqM1+fk/FKktSoR48e9O3bt+owpNqbJxPdzNyvleFDWxh2NXD17I5JkiRJc5da9dGVJEmSmpjoSpIkqZbmya4LUrcYtmTVEcy/hr3Z/jSSJHWRLbqSJEmqJRNdSZIk1ZKJriRJkmrJRFeSJEm1ZKIrSZKkWjLRlSRJUi2Z6EqSJKmWTHQlSZJUSya6kiRJqiUTXUmSJNWSia4kSZJqyURXkiRJtWSiK0mSpFoy0ZUkSVItmehKkiSplkx0JUmSVEsmupIkSaolE11JkiTVkomuJEmSaslEV5IkSbVkoitJkqRaMtGVJElSLZnoSpIkqZZMdCVJklRLJrqSJEmqJRNdSZIk1ZKJriRJkmrJRFeSJEm1ZKIrSZKkWjLRlSRJUi2Z6EqSJKmWTHQlSZJUSya6kiRJqiUTXUmSJNWSia4kSZJqyURXkiRJtWSiK0mSpFqaJxPdiLggIl6NiEcahg2LiBciYnT594WGcSdHxFMR8URE7FBN1JIkSZqTFqpy4RHxcWBzYHngPeARYGRmftjOrBcBvwIuaTb8rMw8s9ky1gb2BfqXy7klIvpl5rSur4EkSZLmVpW06EbEZyPib8ANwI7Ap4C1ge8AD0fEaRHxsdbmz8w7gdc7uLjdgCsyc0pmPgs8BWzcpRWQJEnSXK+qFt0vAIdn5n+bj4iIhYCdge2AqztZ7jERcRAwEvhGZr4BrADc2zDNuHKYJEmSaqySFt3M/BYwLiK+1MK4qZn5p8zsbJL7G2A1YCDwEvCzcni0FEJLBUTEERExMiJGjh8/vpOLlyRJ0tykspvRyn64X+vG8l7JzGlluefzUfeEccBKDZOuCLzYShnnZeagzBzUu3fv7gpNkiRJFaj6qQs3R8Q3I2KliFim6W9WCoqITzW83Z3ixjaA64B9I2KRiOgLrA7c17Ww
JUmSNLer9KkLwKHl/6MbhiWwalszRcTlwGBguYgYB3wPGBwRA8v5xwJHAmTmmIj4A/AoMBU42icuSJIk1V+liW5m9p3F+fZrYfBv25j+dOD0WVmWJEmS5k2Vdl2IiMUi4jsRcV75fvWI2LnKmCRJklQPVffRvRB4H9isfD8O+EF14UiSJKkuqk50V8vMnwAfAGTme7T8ODBJkiSpU6pOdN+PiEUpn2sbEasBU6oNSZIkSXVQ9VMXhgE3AStFxKXA5sAhlUYkSZKkWqj6qQs3R8QoYBOKLgvHZeZrVcYkSZKkeqj6qQu3ZuaEzLwhM/+Sma9FxK1VxiRJkqR6qKRFNyJ6AotR/ODD0nx0A9rHgOWriEmSJEn1UlXXhSOB4ymS2lF8lOi+BZxbUUySJEmqkUoS3cw8Gzg7Io7NzHMax0XEIlXEJEmSpHqp+vFiQ1sYds+cDkKSJEn1U1Uf3U8CKwCLRsT6zNhHd7EqYpIkSVK9VNVHdweK1twVgZ83DH8LOKWKgCRJklQvVfXRvRi4OCL2zMyrq4hBkiRJ9VZ1H91/RMRvI+KvABGxdkR8ueKYJEmSVANVJ7oXAn/jo2fn/ofisWOSJElSl1Sd6C6XmX8APgTIzKnAtGpDkiRJUh1Unei+ExHLAgkQEZsAb1YbkiRJkuqgqqcuNPk6cB2wWkT8A+gN7FVtSJIkSaqDShPdzLw/IrYG1qB4lu4TmflBlTFJkiSpHipNdCOiJ/BVYAuK7gt3RcTwzJxcZVySJEma91XddeESYBLwy/L9fsDvgL0ri0iSJEm1UHWiu0Zmrtfw/vaIeLCyaCRJklQbVT914YHySQsARMRngH9UGI8kSZJqopIW3Yh4mKJPbg/goIj4b/l+FeDRKmKSJElSvVTVdWHnipYrSZKk+UQliW5mPlfFciVJkjT/qLqPriRJkjRbmOhKkiSplipNdCNi8YhYoHzdLyJ2jYgeVcYkSZKkeqi6RfdOoGdErADcChwCXFRpRJIkSaqFqhPdyMx3gT2AX2bm7sDaFcckSZKkGqg80Y2ITYEhwA3lsKp/rU2SJEk1UHWiezxwMnBtZo6JiFWB26sNSZIkSXVQaetpZt4B3NHw/hng2OoikiRJUl1U9RPAv8jM4yPieoqf/p1BZu5aQViSJEmqkapadH9X/j+zouVLkiSp5qr6CeBR5f872ptWkiRJmhVV34wmSZIkzRYmupIkSaqlyhLdiFgwIn46i/NeEBGvRsQjDcN+GhGPR8RDEXFtRCxVDu8TEe9FxOjyb3g3rYIkSZLmYpUlupk5DdgwImIWZr8I+HyzYX8H1snMAcB/KJ7P2+TpzBxY/h01SwFLkiRpnlL1r5A9APw5Iv4IvNM0MDOvaWumzLwzIvo0G3Zzw9t7gb26MU5JkiTNY6pOdJcBJgCfaxiWQJuJbgccClzZ8L5vRDwAvAV8JzPv6mL5kiRJmstV/ctoh3R3mRHxbWAqcGk56CVg5cycEBEbAn+KiP6Z+VYL8x4BHAGw8sord3dokiRJmoMqfepCRPSLiFubbiqLiAER8Z0ulHcwsDMwJDMTIDOnZOaE8vUo4GmgX0vzZ+Z5mTkoMwf17t17VsOQJEnSXKDqx4udT3HT2AcAmfkQsO+sFBQRnwdOBHbNzHcbhveOiAXL16sCqwPPdDFuSZIkzeWq7qO7WGbe1+zBC1PbmykiLgcGA8tFxDjgexQJ8yLA38vy7i2fsLAV8P2ImApMA47KzNe7dS0kSZI016k60X0tIlajuAGNiNiLok9tmzJzvxYG/7aVaa8Gru5KkJIkSZr3VJ3oHg2cB6wZES8AzwJDqg1JkiRJdVD1UxeeAbaNiMWBBTJzUpXxSJIkqT6qfurC0xFxKXAgsFKVsUiSJKleqn7qwtrA/wHLAmdGxDMRcW3FMUmSJKkGqk50p1E8Wmwa8CHwCvBqpRFJkiSpFqq+Ge0t4GHg58D5TT/sIEmSJHVV1S26+wF3Al8FroiI0yJi
m4pjkiRJUg1U/dSFPwN/jog1gR2B44H/ARatMi5JkiTN+6p+6sLVEfE0cDawBHAQsHSVMUmSJKkequ6jewZwf2ZOqzgOSZIk1UzVie5o4OiI2Kp8fwcwPDM/qC4kSZIk1UHVie5vgB7Ar8v3B5bDDqssIkmSJNVC1YnuRpm5XsP72yLiwcqikSRJUm1U/XixaRGxWtObiFiV4scjJEmSpC6pukX3W8DtEfEMEMAqwCHVhiRJkqQ6qPo5urdGxOrAGhSJ7uOZOaXKmCRJklQPlSS6EbFHK6NWiwgy85o5GpAkSZJqp6oW3V3aGJeAia4kSZK6pJJENzPthytJkqTZquqnLkiSJEmzhYmuJEmSaslEV5IkSbU0VyW6ETEoIlaoOg5JkiTN++aqRBf4GvCXiLiy6kAkSZI0b6v6l9FmkJkHA0REr6pjkSRJ0ryt0hbdiNg8IhYvXx8QET+PiFUyc1KVcUmSJGneV3XXhd8A70bEesD/AM8Bl1QbkiRJkuqg6kR3amYmsBtwdmaeDdhtQZIkSV1WdR/dSRFxMnAAsFVELAj0qDgmSZIk1UDVLbr7AFOAL2fmy8AKwE+rDUmSJEl1UGmLbpnc/rzh/X+xj64kSZK6QSWJbkRMArK18Zn5sTkYjiRJkmqokkQ3M3sBRMT3gZeB3wEBDMGb0SRJktQNqu6ju0Nm/jozJ2XmW5n5G2DPimOSJElSDVSd6E6LiCERsWBELBARQ4BpFcckSZKkGqg60d0f+BLwSvm3dzlMkiRJ6pKqn7owluLHIiRJkqRuVWmiGxG9gcOBPo2xZOahVcUkSZKkeqj6l9H+DNwF3IJ9cyVJktSNqk50F8vMEyuOQZIkSTVU9c1of4mIL3R2poi4ICJejYhHGoYtExF/j4gny/9LN4w7OSKeiognImKH7gpekiRJc6+qE93jKJLdyRExqfx7qwPzXQR8vtmwk4BbM3N14NbyPRGxNrAv0L+c59cRsWB3rYAkSZLmTpUmupnZKzMXyMye5eteHfn538y8E3i92eDdgIvL1xcDX2wYfkVmTsnMZ4GngI27Zw0kSZI0t6q6jy4RsSuwVfl2RGb+ZRaL+kRmvgSQmS9FxMfL4SsA9zZMN64cJkmSpBqrtEU3Is6g6L7waPl3XDmsWxfTwrBsJZ4jImJkRIwcP358N4chSZKkOanqPrpfALbLzAsy8wKKPrSdvjmt9EpEfAqg/P9qOXwcsFLDdCsCL7ZUQGael5mDMnNQ7969ZzEMSZIkzQ2qTnQBlmp4vWQXyrkOOLh8fTDFM3qbhu8bEYtERF9gdeC+LixHkiRJ84Cq++j+CHggIm6n6GKwFXByezNFxOXAYGC5iBgHfA84A/hDRHwZ+C+wN0BmjomIP1B0jZgKHJ2Z/jiFJElSzVWa6Gbm5RExAtiIItE9MTNf7sB8+7UyaptWpj8dOH1W45QkSdK8p+qb0XYH3s3M6zLzz8DkiPhilTFJkiSpHqruo/u9zHyz6U1mTqTohiBJkiR1SdWJbkvLr7rfsCRJkmqg6kR3ZET8PCJWi4hVI+IsYFTFMUmSJKkGqk50vwa8D1wJ/AF4Dzi60ogkSZJUC1U/deEd4KSIWCIz364yFkmSJNVL1U9d2Cwimn7+l4hYLyJ+XWVMkiRJqoequy6cBewATADIzAcpfjRCkiRJ6pKqE10y8/lmg/zVMkmSJHVZ1Y/yej4iNgMyIhYGjgUeqzgmSZIk1UDVLbpHUTxlYQVgHDAQn7ogSZKkblD1UxdeA4ZUGYMkSZLqqeqnLvwkIj4WET0i4taIeC0iDqgyJkmSJNVD1V0Xts/Mt4CdKbou9AO+VW1IkiRJqoOqE90e5f8vAJdn5utVBiNJkqT6qPqpC9dHxOMUP/371YjoDUyuOCZJkiTVQKUtupl5ErApMCgzPwDeBXarMiZJkiTVQyWJbkRs0fQ6M9/IzGnl63cy8+XyBrV1qohNkiRJ9VBV14U9I+InwE3AKGA80BP4
NPBZYBXgGxXFJkmSpBqoJNHNzBMiYmlgL2Bv4FMU/XQfA/4vM++uIi5JkiTVR2U3o2XmG8D55Z8kSZLUrap+vJgkSZI0W5joSpIkqZZMdCVJklRLlSa6EbFYRHw3Is4v368eETtXGZMkSZLqoeoW3QuBKRQ/GgEwDvhBdeFIkiSpLqpOdFfLzJ8AHwBk5ntAVBuSJEmS6qDqRPf9iFgUSICIWI2ihVeSJEnqksqeo1v6HsWvo60UEZcCmwNDK41IkiRJtVBpopuZf4+I+4FNKLosHJeZr1UZkyRJkuqh6q4LACsACwILA1tFxB4VxyNJkqQaqLRFNyIuAAYAY4APy8EJXFNZUJIkSaqFqvvobpKZa1ccgyRJkmqo6q4L90SEia4kSZK6XdUtuhdTJLsvUzxWLIDMzAHVhiVJkqR5XdWJ7gXAgcDDfNRHV5IkSeqyqhPd/2bmdRXHIEmSpBqqOtF9PCIuA66n4RfRMtOnLkiSJKlLqk50F6VIcLdvGObjxSRJktRlVf8y2iFVLl+SJEn1VUmiGxH/k5k/iYhfUrTgziAzj53FctcArmwYtCpwKrAUcDgwvhx+SmbeOCvLkCRJ0ryhqhbdx8r/I7uz0Mx8AhgIEBELAi8A1wKHAGdl5pnduTxJkiTNvSpJdDPz+vLlu5n5x8ZxEbF3Ny1mG+DpzHwuIrqpSEmSJM0rqv5ltJM7OGxW7Atc3vD+mIh4KCIuiIilu2kZkiRJmktV1Ud3R+ALwAoRcU7DqI8BU7uh/IWBXfkoaf4N8L8U/YH/F/gZcGgL8x0BHAGw8sordzUMSZIkVaiqFt0XKfrnTgZGNfxdB+zQDeXvCNyfma8AZOYrmTktMz8Ezgc2bmmmzDwvMwdl5qDevXt3QxiSJEmqSlV9dB8EHoyIyzLzg9mwiP1o6LYQEZ/KzJfKt7sDj8yGZUqSJGkuUvVzdLs9yY2IxYDtgCMbBv8kIgZSdF0Y22ycJEmSaqjqX0brdpn5LrBss2EHVhSOJEmSKlL1UxckSZKk2aLSFt2I6Ad8C1ilMZbM/FxlQUmSJKkWqu668EdgOMWTEKZVHIskSZJqpOpEd2pm/qbiGCRJklRDVf1gxDLly+sj4qvAtcCUpvGZ+XoVcUmSJKk+qmrRHUXxqK8o33+rYVwCq87xiCRJklQrVf1gRN8qlitJkqT5R6WPF4uIoyNiqYb3S5ddGSRJkqQuqfo5uodn5sSmN5n5BnB4deFIkiSpLqpOdBeIiKZ+ukTEgsDCFcYjSZKkmqj68WJ/A/4QEcMpbkI7Crip2pAkSZJUB1UnuicCRwJfoXgCw83A/6s0IkmSJNVCpYluZn4YEb8F7qZo0X0iM/2FNEmSJHVZpYluRAwGLgbGUrTorhQRB2fmnRWGJUmSpBqouuvCz4DtM/MJgIjoB1wObFhpVJIkSZrnVf3UhR5NSS5AZv4H6FFhPJIkSaqJqlt0R5Z9dH9Xvh9C8fPAkiRJUpdUneh+BTgaOJaij+6dwK8rjUiSJEm1UPVTF6ZExK+AW4EPKZ668H6VMUmSJKkeqn7qwk7AcOBpihbdvhFxZGb+tcq4JEmSNO+ruuvCz4DPZuZTABGxGnADYKIrSZKkLqn6qQuvNiW5pWeAV6sKRpIkSfVRdYvumIi4EfgDxS+j7Q38OyL2AMjMa6oMTpIkSfOuqhPdnsArwNbl+/HAMsAuFImvia4kSZJmSdVPXTikyuVLkiSpvirtoxsR/SLi1oh4pHw/ICK+U2VMkiRJqoeqb0Y7HzgZ+AAgMx8C9q00IkmSJNVC1YnuYpl5X7NhUyuJRJIkSbVSdaL7Wvns3ASIiL2Al6oNSZIkSXVQ9VMXjgbOA9aMiBeAZ4Eh1YYkSZKkOqj6qQvPANtGxOLAApk5qcp4JEmSVB9Vt+gCkJnvVB2DJEmS6qXqPrqSJEnSbGGiK0mSpFqqvOtCRGwG9KEhlsy8pLKAJEmSVAuVJroR8TtgNWA0MK0cnICJriRJ
krqk6hbdQcDamZkVxyFJkqSaqbqP7iPAJyuOQZIkSTVUdYvucsCjEXEfMKVpYGbuWl1IkiRJqoOqE91hFS9fkiRJNVX1L6Pd0d1lRsRYYBLFzW1TM3NQRCwDXEnxdIexwJcy843uXrYkSZLmHpX20Y2ITSLi3xHxdkS8HxHTIuKtbij6s5k5MDMHle9PAm7NzNWBW8v3kiRJqrGqb0b7FbAf8CSwKHBYOay77QZcXL6+GPjibFiGJEmS5iJVJ7pk5lPAgpk5LTMvBAZ3tUjg5ogYFRFHlMM+kZkvlct7Cfh4F5chSZKkuVzVN6O9GxELA6Mj4ifAS8DiXSxz88x8MSI+Dvw9Ih7v6IxlYnwEwMorr9zFMCRJklSlqlt0DyxjOAZ4B1gJ2LMrBWbmi+X/V4FrgY2BVyLiUwDl/1dbmfe8zByUmYN69+7dlTAkSZJUsUoT3cx8DgjgU5l5WmZ+vezKMEsiYvGI6NX0Gtie4kcprgMOLic7GPhz1yKXJEnS3K7qpy7sAowGbirfD4yI67pQ5CeAuyPiQeA+4IbMvAk4A9guIp4EtivfS5Ikqcaq7qM7jKJrwQiAzBwdEX1mtbDMfAZYr4XhE4BtZrVcSZIkzXuq7qM7NTPfrDgGSZIk1VDVLbqPRMT+wIIRsTpwLPDPimOSJElSDVTdovs1oD8wBbgceAs4vsqAJEmSVA+Vtuhm5rvAt8s/SZIkqdtUkui292SFzNx1TsUiSZKkeqqqRXdT4HmK7gr/oniWriRJktRtqkp0P0nxPNv9gP2BG4DLM3NMRfFIkiSpZiq5GS0zp2XmTZl5MLAJ8BQwIiK+VkU8kiRJqp/KbkaLiEWAnShadfsA5wDXVBWPJEmS6qWqm9EuBtYB/gqclpmPVBGHJEmS6quqFt0DgXeAfsCxEdPvRQsgM/NjFcUlSZKkmqgk0c3Mqn+oQpIkSTVnwilJkqRaMtGVJElSLZnoSpIkqZZMdCVJklRLJrqSJEmqJRNdSZIk1ZKJriRJkmrJRFeSJEm1ZKIrSZKkWjLRlSRJUi2Z6EqSJKmWTHQlSZJUSya6kiRJqiUTXUmSJNWSia4kSZJqyURXkiRJtWSiK0mSpFoy0ZUkSVItmehKkiSplkx0JUmSVEsmupIkSaolE11JkiTVkomuJEmSaslEV5IkSbVkoitJkqRaMtGVJElSLZnoSpIkqZZMdCVJklRLtUp0I2KliLg9Ih6LiDERcVw5fFhEvBARo8u/L1QdqyRJkmavhaoOoJtNBb6RmfdHRC9gVET8vRx3VmaeWWFskiRJmoNqlehm5kvAS+XrSRHxGLBCtVFJkiSpCrXqutAoIvoA6wP/KgcdExEPRcQFEbF0dZFJkiRpTqhlohsRSwBXA8dn5lvAb4DVgIEULb4/a2W+IyJiZESMHD9+/JwKV5IkSbNB7RLdiOhBkeRempnXAGTmK5k5LTM/BM4HNm5p3sw8LzMHZeag3r17z7mgJUmS1O1qlehGRAC/BR7LzJ83DP9Uw2S7A4/M6dgkSZI0Z9XqZjRgc+BA4OGIGF0OOwXYLyIGAgmMBY6sIjhJkiTNObVKdDPzbiBaGHXjnI5FkiRJ1apV1wVJkiSpiYmuJEmSaslEV5IkSbVkoitJkqRaMtGVJElSLZnoSpIkqZZMdCVJklRLJrqSJEmqJRNdSZIk1ZKJriRJkmrJRFeSJEm1ZKIrSZKkWjLRlSRJUi2Z6EqSJKmWTHQlSZJUSya6kiRJqiUTXUmSJNWSia4kSZJqaaGqA5AkzT/6nHRD1SHMt8aesVPVIUhznC26kiRJqiUTXUmSJNWSXRckSZofDFuy6gjmX8PerDqC+ZYtupIkSaolE11JkiTVkomuJEmSaslEV5IkSbVkoitJkqRaMtGVJElSLZnoSpIkqZZMdCVJklRLJrqSJEmqJRNdSZIk1ZKJriRJkmrJRFeSJEm1ZKIrSZKkWjLRlSRJUi2Z6EqSJKmWTHQlSZJUSya6kiRJ
qiUTXUmSJNXSfJXoRsTnI+KJiHgqIk6qOh5JkiTNPvNNohsRCwLnAjsCawP7RcTa1UYlSZKk2WW+SXSBjYGnMvOZzHwfuALYreKYJEmSNJvMT4nuCsDzDe/HlcMkSZJUQwtVHcAcFC0MyxkmiDgCOKJ8+3ZEPDHbo1JlApYDXqs6jvnSaS0djpJmJ+u8Cs2ZOm+VObGQec38lOiOA1ZqeL8i8GLjBJl5HnDenAxK1YmIkZk5qOo4JGlOsM7T/Gh+6rrwb2D1iOgbEQsD+wLXVRyTJEmSZpP5pkU3M6dGxDHA34AFgQsyc0zFYUmSJGk2mW8SXYDMvBG4seo4NNewm4qk+Yl1nuY7kZntTyVJkiTNY+anPrqSJEmaj8xXXRc094qIZYFby7efBKYB48v3G5c/8tHavIOAgzLz2E4sbywwqVwOwJ2dmb8D5b+dmUt0V3mS6qkrdV85/2Dg/cz8ZwvjhgI/BV5oGLx/Zj7atainlz8MeDszz+yO8qTZwURXc4XMnAAMhJYrz4hYKDOntjLvSGDkLCz2s5npMyUlVaa9uq8DBgNvAzMluqUrM/OYLoQozdPsuqC5VkRcFBE/j4jbgR9HxMYR8c+IeKD8v0Y53eCI+Ev5elhEXBARIyLimYjoVCttOd8vyvIfiYiNy+HLRMSfIuKhiLg3IgaUw5eIiAsj4uFy3J4NZZ0eEQ+W03+i2zaMpFqLiA0j4o6IGBURf4uIT5XDj42IR8u65oqI6AMcBZwQEaMjYssOlj84Iu6MiGvL8oZHxALluP3K+uyRiPhxwzyfj4j7yzrt1obi1p7V+laaE2zR1dyuH7BtZk6LiI8BW5WPitsW+CGwZwvzrAl8FugFPBERv8nMD1qY7vaIaOq6cHFmnlW+XjwzN4uIrYALgHWA04AHMvOLEfE54BKKVpjvAm9m5roAEbF0UxnAvZn57Yj4CXA48IOubAhJ84UAfgnslpnjI2If4HTgUOAkoG9mTomIpTJzYkQMp+1W4H0iYouG95uW/zcG1gaeA24C9oiIfwI/BjYE3gBujogvAv8Azqeof5+NiGUayutofStVwkRXc7s/ZmZTMrokcHFErE7x8809WpnnhsycAkyJiFeBT1D8Ml5zrXVduBwgM++MiI9FxFLAFpRJdWbeFhHLRsSSwLYUPz5COe6N8uX7wF/K16OA7Tq0tpLmd4tQfLn+e0RA8dz3l8pxDwGXRsSfgD91sLyZui6U5d6Xmc+U7y+nqOM+AEZk5vhy+KXAVhT9hu/MzGcBMvP1huI6Wt9KlTDR1dzunYbX/wvcnpm7l5fsRrQyz5SG19Po/H7e/Jl7SdHK0tJ00cL0AB/kR8/um5UYJM2fAhiTmZu2MG4nisRzV+C7EdG/C8vpaD3XFFNrzyLtan0rzVb20dW8ZEk+unt46Gxczj4A5eW+NzPzTeBOYEg5fDDwWma+BdwMTG8taei6IEmzYgrQOyI2BYiIHhHRv+xDu1Jm3g78D7AUsATF02N6zcJyNo6IvmW5+wB3A/8Cto6I5SJiQWA/4A7gnnJ43zKmZVorVJrbmOhqXvIT4EcR8Q+Ky3lddXt5A8foiLikYfgbZV+14cCXy2HDgEER8RBwBnBwOfwHwNLljRsPUvRVk6RZ9SGwF8UNuA8Co4HNKOq830fEw8ADwFmZORG4Hti9jZvR9mmo50ZHxGbl8Hso6rJHgGeBazPzJeBk4HbgQeD+zPxz2ZXhCOCaMqYrZ8uaS7OBv4wmNYiIEcA3y0eWSVLtlFelvpmZO1ccijTb2aIrSZKkWrJFV5IkSbVki64kSZJqyURXkiRJtWSiK0mSpFoy0ZUkSVItmehKkiSplkx0JUmSVEv/H5PyfAcOfleQAAAAAElFTkSuQmCC",
      "text/plain": [
       "<Figure size 720x504 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Create a directory to save the multi-run figure to \n",
    "os.makedirs(\"pytorch_2_results/figures\", exist_ok=True)\n",
    "\n",
    "# Create a path to save the figure for multiple runs\n",
    "save_path_multi_run = f\"pytorch_2_results/figures/multi_run_{GPU_NAME}_{MODEL_NAME}_{DATASET_NAME}_{IMAGE_SIZE}_train_epoch_time.png\"\n",
    "\n",
    "# Plot the mean epoch times for experiment 3 and 4\n",
    "plot_mean_epoch_times(non_compiled_results=non_compile_results_multiple_runs_df, \n",
    "                      compiled_results=compile_results_multiple_runs_df, \n",
    "                      multi_runs=True, \n",
    "                      num_runs=NUM_RUNS, \n",
    "                      save_path=save_path_multi_run, \n",
    "                      save=True)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Nice! \n",
    "\n",
    "Looks like the compiled model edges out the non-compiled model across multiple runs.\n",
    "\n",
    "This is likely because on a single run (with a low amount of epochs), the compiling of the model takes quite a bit of time for the first epoch to run.\n",
    "\n",
    "However, when the model has already been compiled and starts training for longer, the speedups from the behind the scenes optimizations start to show.\n",
    "\n",
    "A possible extension would be to let the model train for a longer time, say 100 epochs, and see how the results compare."
   ]
  },
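  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To put a rough number on this trade-off, here's a minimal sketch of the break-even arithmetic: the one-off compile overhead divided by the time saved per epoch. The function name `break_even_epochs` and the timings below are hypothetical placeholders (they are not measurements from this notebook), so swap in your own numbers.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "\n",
    "def break_even_epochs(compile_overhead_s: float,\n",
    "                      eager_epoch_s: float,\n",
    "                      compiled_epoch_s: float) -> float:\n",
    "    # Time saved on every epoch once the model has been compiled\n",
    "    saving_per_epoch = eager_epoch_s - compiled_epoch_s\n",
    "    if saving_per_epoch <= 0:\n",
    "        # Compiled epochs are not faster, so the overhead is never recovered\n",
    "        return math.inf\n",
    "    # Number of epochs needed for the savings to cover the one-off compile cost\n",
    "    return compile_overhead_s / saving_per_epoch\n",
    "\n",
    "\n",
    "# Hypothetical timings in seconds (NOT measured in this notebook)\n",
    "epochs_needed = break_even_epochs(compile_overhead_s=60.0,\n",
    "                                  eager_epoch_s=20.0,\n",
    "                                  compiled_epoch_s=17.0)\n",
    "print(f\"Compiling pays off after about {math.ceil(epochs_needed)} epochs\")\n",
    "```\n",
    "\n",
    "If the compiled epochs aren't faster than the non-compiled ones, the overhead is never recovered, which is why the sketch returns infinity in that case."
   ]
  },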
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "### 4.4 Save multi run results to file with GPU details\n",
    "\n",
    "Let's also save our results dataframes for experiments 3 and 4 to file in case we'd like to inspect them later or compare them to other kinds of models."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[INFO] Saving experiment 3 non-compiled results to: pytorch_2_results/multi_run_results/single_run_non_compiled_results_CIFAR10_ResNet50_NVIDIA_TITAN_RTX.csv\n",
      "[INFO] Saving experiment 4 compiled results to: pytorch_2_results/multi_run_results/single_run_compiled_results_CIFAR10_ResNet50_NVIDIA_TITAN_RTX.csv\n"
     ]
    }
   ],
   "source": [
    "# Make a directory for multi_run results\n",
    "import os\n",
    "pytorch_2_results_dir = \"pytorch_2_results\"\n",
    "pytorch_2_multi_run_results_dir = f\"{pytorch_2_results_dir}/multi_run_results\"\n",
    "os.makedirs(pytorch_2_multi_run_results_dir, exist_ok=True)\n",
    "\n",
    "# Create filenames for each of the dataframes\n",
    "save_name_for_multi_run_non_compiled_results = f\"multi_run_non_compiled_results_{NUM_RUNS}_runs_{DATASET_NAME}_{MODEL_NAME}_{GPU_NAME}.csv\"\n",
    "save_name_for_multi_run_compiled_results = f\"multi_run_compiled_results_{NUM_RUNS}_runs_{DATASET_NAME}_{MODEL_NAME}_{GPU_NAME}.csv\"\n",
    "\n",
    "# Create filepaths to save the results to\n",
    "multi_run_no_compile_save_path = f\"{pytorch_2_multi_run_results_dir}/{save_name_for_non_compiled_results}\"\n",
    "multi_run_compile_save_path = f\"{pytorch_2_multi_run_results_dir}/{save_name_for_compiled_results}\"\n",
    "print(f\"[INFO] Saving experiment 3 non-compiled results to: {multi_run_no_compile_save_path}\")\n",
    "print(f\"[INFO] Saving experiment 4 compiled results to: {multi_run_compile_save_path}\")\n",
    "\n",
    "# Save the results\n",
    "non_compile_results_multiple_runs_df.to_csv(multi_run_no_compile_save_path)\n",
    "compile_results_multiple_runs_df.to_csv(multi_run_compile_save_path)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Possible improvements and extensions\n",
    "\n",
    "We've explored the fundamentals of `torch.compile()` and wrote code for several experiments to test how it performs.\n",
    "\n",
    "But there's still more we could do.\n",
    "\n",
    "As we've discussed, many of the speedups in PyTorch 2.0 and `torch.compile()` come from using newer GPUs (e.g. A100 and above) and using as much of the GPU as possible (larger batch sizes, larger model sizes).\n",
    "\n",
    "For even more speedups, I'd recommend researching/trying the following:\n",
    "\n",
    "* **More powerful CPUs** - I have a sneaking suspicion that Google Colab instances are limited to 2 CPU cores, speedup numbers could be improved with more CPUs. This could be tracked via the [PyTorch Profiler](https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html) (a tool to find what processes take what time).\n",
    "* **Using mixed precision training** - newer GPUs have the ability to handle different precision types (e.g. [`torch.float16`](https://pytorch.org/docs/stable/tensors.html#data-types) and [`torch.bfloat16`](https://pytorch.org/docs/stable/generated/torch.Tensor.bfloat16.html)) which enable faster training and inference. I'd suspect you'll see an even larger speedup than we've seen here by using mixed precision training. For more on this, see the [PyTorch documentation for automatic mixed precision](https://pytorch.org/docs/stable/notes/amp_examples.html#amp-examples) (also called AMP) with PyTorch. \n",
    "* **Transformer based models may see more *relative* speedups than convolutional models** - PyTorch 2.0 includes a [stable release for accelerated transformer models](https://pytorch.org/blog/pytorch-2.0-release/#stable-accelerated-pytorch-2-transformers) (models which use the attention mechanism). The main speedups come from an improved implementation of [`scaled_dot_product_attention()`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html?highlight=scaled_dot_product#torch.nn.functional.scaled_dot_product_attention) which automatically selects the best version of attention to use based on the hardware you're computing on. You can see more in the [dedicated PyTorch tutorial](https://pytorch.org/tutorials/intermediate/scaled_dot_product_attention_tutorial.html). \n",
    "* **Train for longer** - As previously discussed, the speedups from `torch.compile()` are likely to be more noticeable when training for longer. A great exercise would be to train over a longer number of epochs, potentially on a different dataset with a different model (e.g. a transformer) and see how the speedups compare."
   ]
  },
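  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a starting point for the mixed precision idea above, here's a minimal sketch of an automatic mixed precision (AMP) training loop using `torch.autocast` and `torch.cuda.amp.GradScaler`. The tiny model and random data are stand-ins made up for illustration (not the ResNet50 + CIFAR10 setup from the experiments), and falling back to `torch.bfloat16` on CPU is only an assumption so the snippet can run without a GPU.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "from torch import nn\n",
    "\n",
    "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
    "# float16 on GPU, bfloat16 on CPU (assumption: chosen so the sketch also runs on CPU)\n",
    "amp_dtype = torch.float16 if device == \"cuda\" else torch.bfloat16\n",
    "\n",
    "# Hypothetical stand-in model and random data (not the notebook's experiment setup)\n",
    "model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10)).to(device)\n",
    "optimizer = torch.optim.SGD(model.parameters(), lr=0.01)\n",
    "loss_fn = nn.CrossEntropyLoss()\n",
    "\n",
    "# Gradient scaling guards against float16 underflow, so it is only enabled on GPU\n",
    "scaler = torch.cuda.amp.GradScaler(enabled=(device == \"cuda\"))\n",
    "\n",
    "X = torch.randn(16, 32, device=device)\n",
    "y = torch.randint(0, 10, (16,), device=device)\n",
    "\n",
    "for step in range(3):\n",
    "    optimizer.zero_grad()\n",
    "    # Run the forward pass and loss calculation in lower precision where it is safe\n",
    "    with torch.autocast(device_type=device, dtype=amp_dtype):\n",
    "        logits = model(X)\n",
    "        loss = loss_fn(logits, y)\n",
    "    # Scale the loss before backward, then step the optimizer and update the scale factor\n",
    "    scaler.scale(loss).backward()\n",
    "    scaler.step(optimizer)\n",
    "    scaler.update()\n",
    "    print(f\"step {step} | loss: {loss.item():.4f}\")\n",
    "```\n",
    "\n",
    "To combine this with PyTorch 2.0, you could pass the model to `torch.compile()` before the loop. That combination isn't benchmarked in this notebook, so treat any speedup as something to measure rather than assume."
   ]
  },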
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. Resources to learn more\n",
    "\n",
    "I've found the following resources to be helpful learning about PyTorch 2.0 and it's upcoming features.\n",
    "\n",
    "* [PyTorch 2.0 launch blog post](https://pytorch.org/get-started/pytorch-2.0/). \n",
    "* [PyTorch 2.0 release notes](https://pytorch.org/blog/pytorch-2.0-release/) (blog post).\n",
    "    * As well as the [GitHub release notes](https://github.com/pytorch/pytorch/releases/tag/v2.0.0) (lots of info here!).\n",
    "* [PyTorch default device context manager docs](https://github.com/pytorch/tutorials/pull/2220/files).\n",
    "* [PyTorch 2.0 video introduction on YouTube](https://youtu.be/WqLKfta5Ijw) (created by yours truly).\n",
    "* See a [tip by Sebastian Raschka](https://twitter.com/rasbt/status/1638297626385719297?s=20) to improve `torch.compile()` by performing an example batch first (warm-up the model) before continuing with further training (this explains the increased speedups with multiple runs)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.4"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "03bc13acfc4e8139fb32f411c6712485d4605f3bdd6569f6973c62d6adcc8291"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
